Going beyond the eLearning standards with Rustici Generator

Rustici Software has a long history of solving complex problems within eLearning. Whether it is making your LMS SCORM compliant or dispatching courses out to multiple LMSs, we help turn those problems into simple solutions. With that in mind, we’re laying the groundwork for a new product called Rustici Generator. Generator aims to tackle the difficult task of parsing eLearning content and then using AI to generate useful assets from it. Once content is parsed, Generator aims to create metadata for the content, provide semantic search of libraries, generate assessments for the content, and further enable our customers to produce personalized learning experiences. Contributing to this new product has been a very rewarding experience, as I’ve gained valuable exposure to the world of AI, built cool things using new languages and tools, and worked with partners to help solve big problems. Building this product from the ground up has also uncovered many challenges and learning opportunities for our team.

Our team has experience managing and working with a range of eLearning standards. While the many eLearning standards serve a common purpose, the implementation details of each differ. Learning content can also be organized in many different ways, as the standards themselves usually set no expectations for how a package is structured. This variance, both between the standards and in how individual learning content packages are designed, is one of Generator’s core challenges.

Making sense of standards implementations

The main information we look for when processing learning content is the “content text”. This is the textual representation of the course: the information that the course intends to communicate to the learner. Deriving this text looks a little different depending on the type of learning content under consideration. For media-style content, such as PDF documents or MP3 files, deriving the content text is rather straightforward. We can parse text directly from PDF documents, and MP3 files can be sent to a transcription service that returns the spoken text as captions. Deriving text from packaged eLearning content is a much bigger challenge, as the standards do not set expectations on where a particular course will store the content text or how the learning content will be organized.

We started by implementing a ‘general’ solution enabling Generator to parse any generic eLearning package. As Generator imports the content, the application walks through the content’s file structure, looking for files and text that we believe contain information that the course intends to communicate to the learner. Because we want to handle a wide variety of possible eLearning course structures, this process will occasionally miss information that is stored in unexpected files. For best results, we’ve started writing ‘publisher-specific’ parsers that have more rigid expectations for where the critical information is stored in the content package. For example, when importing learning content published by Gomo Learning, we know the textual data for the course is stored in very specific files, and Generator can target those particular files, which makes the import process much cleaner.
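The difference between the general walk and a publisher-specific parser can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Generator’s actual implementation: the function names (`extract_content_text`, `html_to_text`), the choice to treat only `.html`/`.htm` files as candidates in the general case, and the `target_files` parameter are all hypothetical.

```python
import os
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(markup):
    """Return the visible text of an HTML string as one space-joined line."""
    collector = _TextCollector()
    collector.feed(markup)
    collector.close()
    return " ".join(collector.chunks)


def extract_content_text(package_dir, target_files=None):
    """Walk an unpacked content package and return {relative_path: text}.

    General mode (target_files is None): every .html/.htm file is a candidate.
    Publisher-specific mode: only the listed relative paths are read
    (hypothetical stand-in for a parser that knows where the text lives).
    """
    results = {}
    for root, _dirs, files in os.walk(package_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            rel = os.path.relpath(path, package_dir).replace(os.sep, "/")
            if target_files is not None:
                if rel not in target_files:
                    continue
            elif not name.lower().endswith((".html", ".htm")):
                continue
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = html_to_text(handle.read())
            if text:
                results[rel] = text
    return results
```

Calling `extract_content_text(package_dir)` mirrors the general approach and may pick up navigation or boilerplate pages, while passing `target_files` restricts the walk to files a publisher is known to use for course text.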

Turning text into usable content data

Once Generator has parsed and stored the text for a given piece of learning content, the application can start doing some cool things. By bringing in generative AI tools, Generator can use the course text to generate other metadata properties, such as a ‘summary’, ‘title’, ‘keywords’, and ‘thumbnails’ for that course. With the help of a vector database and a user-supplied Skills Taxonomy, Generator can perform word associations to select the set of skills most closely associated with the learning content. Generator can also use content text word associations to support semantic search, which can help quickly determine relevant topics for a set of courses. Using the course text, Generator can build multiple-choice questions that can be used to verify that a learner understands the course’s key topics.
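The idea behind matching course text to skills, or to a search query, can be illustrated with a small similarity ranking. This is a toy sketch under stated assumptions, not how Generator works internally: it uses simple word-count vectors and cosine similarity in place of a learned embedding model and a vector database, and the names `vectorize`, `cosine_similarity`, and `rank_by_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Toy stand-in for an embedding: a lowercase word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_text, candidates):
    """Rank {name: description} candidates by similarity to query_text, best first."""
    query_vector = vectorize(query_text)
    scored = [
        (name, cosine_similarity(query_vector, vectorize(description)))
        for name, description in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With `candidates` holding skill descriptions from a taxonomy and `query_text` holding a course’s content text, the top-ranked entries suggest associated skills; swapping the roles (a search phrase as the query, course texts as candidates) gives a crude search. Real embeddings capture meaning beyond shared words, which word counts cannot.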

Generator’s core features expand what one can do with a complex content library. It all starts with a broad and complicated problem: understanding the information each course intends to communicate and, from there, understanding the content library as a whole. As we’ve grown and expanded Generator’s content support, we’ve had to tackle that problem head-on. It’s a problem we enjoy working through, as we want Generator to simplify interactions with complex content libraries, no matter what content they contain. As we approach the release and launch of Generator, you can watch our webinar to get a deeper look at the application, subscribe to updates as they come out, or reach out to our team to learn more.

John Griner