Course Builder: Migration Scripts from Sanity to Course Builder for Epic Web

TL;DR: Moving Epic Web content from Sanity to Course Builder using scripts to ensure content is correctly structured in the new system.

We are moving Epic Web to Course Builder while keeping the same database. Sanity is being removed, so its content types need to be migrated into the current skill stack database. This includes workshops, lessons, sections, products, articles, tips, tutorials, and contributions.

Scripts in the Course Builder repo handle this migration by transferring content and writing the associations between resources. For articles, the script skips records that already exist, transforms the Sanity data into the content resource shape, and inserts it. Tips also carry a video resource and its transcript, and coalesce is used to fall back to a second transcript source when the first is missing. Tutorials are modules with sections and lessons, each written with a position index so they can be sorted.

Workshops are like tutorials but linked to for-sale products. The migration process ensures all data, including video resources and transcripts, is correctly moved into the Course Builder database.

Database tables and associations for the new content structure


[00:00] We are migrating Epic Web over to Course Builder. It's actually going to use the same database, but we're eliminating Sanity, and that means all these different content types that we present need to be migrated into our database, the current skill stack database, which is represented over here, where they don't exist yet. So over here we see content resource, and content resource resource, meaning associations of content resources that belong to others. So a workshop has many lessons or sections, and those are associated through the content resource resource join table. We can also associate them with the product, and that's coming up too.

[00:46] And then we have contributions for authors and stuff, so these are the three primary tables that are needed to kind of mirror what we're doing over there. Product also comes in, so we add fields to it, because we have the idea of the Sanity product, which has metadata around the product. So those will come over eventually. It kind of looks like this. This is the set so far, and I've gotten through workshops. Let's see.
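As a rough sketch of how the three tables described above relate to each other, the TypeScript below models them as plain row types. Every field name here is an assumption made for illustration and is not taken from the actual Course Builder schema.

```typescript
// Hypothetical row shapes; field names are assumptions, not the real schema.
type ContentResource = {
  id: string
  type: string // e.g. 'workshop', 'section', 'lesson'
  fields: Record<string, unknown> // arbitrary JSON describing the content
}

// Join row: `resourceId` belongs to `resourceOfId` at a given position.
type ContentResourceResource = {
  resourceOfId: string
  resourceId: string
  position: number
}

// Who contributed to a piece of content, and in what role.
type ContentContribution = {
  userId: string
  contentId: string
  contributionType: string // e.g. 'author', 'instructor'
}

// Resolve the ordered children of a parent resource through the join rows.
function childrenOf(
  parentId: string,
  joins: ContentResourceResource[],
  resources: ContentResource[],
): ContentResource[] {
  return joins
    .filter((join) => join.resourceOfId === parentId)
    .sort((a, b) => a.position - b.position)
    .map((join) => resources.find((r) => r.id === join.resourceId))
    .filter((r): r is ContentResource => r !== undefined)
}

// List the user ids that contributed to a piece of content in a given role.
function contributorsOf(
  contentId: string,
  role: string,
  contributions: ContentContribution[],
): string[] {
  return contributions
    .filter((c) => c.contentId === contentId && c.contributionType === role)
    .map((c) => c.userId)
}
```

With this shape, a workshop's lessons are never stored on the workshop itself; they are found by following the join rows, which is what lets the same lesson type sit under a section or directly under a module.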

[01:14] So over here I am in the Course Builder repo, in apps, epic web, inside of source. I'm using scripts, and we have migrate all, which is everything, and this is all of the stuff. So migrate articles, migrate tips, tutorials, workshops, contribution types, all those. So I can come down here and run migrate all, and that is going to go through, and I'll just let that run down here. Articles is running first, and there's the article schema. So this is the schema of the thing coming in from Sanity.

[01:55] I did use the CoGPT to make a lot of these. It was pretty handy to take results from the actual Sanity. You can see down here it started doing collections. I can't drag. There it is.

[02:09] So it's doing sections and the individual resources within those sections. Workshops, a lot of those. So we have the article schema, and we come in here and we're gonna load the article with the... This is just a fetch implementation that we have. It grabs the contributors too, and then just cycles through and it's trying...

[02:30] If it finds it, it's not... If it exists, we're skipping it. So we're not writing over things that already exist that have that ID. It goes through, I use the article schema, and this is the schema that actually exists in here. The product, lib, articles. So this is the article schema.
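The skip-if-exists check described above could look something like the following. This is a minimal sketch: `IncomingArticle`, `selectNewArticles`, and the idea of passing in a set of existing IDs are hypothetical stand-ins, and the actual script presumably looks each ID up in the database instead.

```typescript
// Hypothetical shape of an article as fetched from Sanity.
type IncomingArticle = { _id: string; title: string }

// Keep only articles whose ID is not already in the target database.
// `existingIds` stands in for a database lookup (an assumption).
function selectNewArticles(
  incoming: IncomingArticle[],
  existingIds: ReadonlySet<string>,
): IncomingArticle[] {
  const seen = new Set(existingIds)
  const fresh: IncomingArticle[] = []
  for (const article of incoming) {
    if (seen.has(article._id)) continue // already migrated, do not overwrite
    seen.add(article._id) // also guards against duplicates within one run
    fresh.push(article)
  }
  return fresh
}
```

Because existing records are skipped and never rewritten, a migration written this way can be run again after a partial failure without clobbering anything that was already moved.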

[02:51] This is just kind of the way it's set up on all of these Course Builder apps. This is app specific, so it could change per app, but right now they're all sharing the same shape. We'll probably standardize that, and some of these common types will move over to core instead of being on the individual app level, and we can just stick with that and override it as needed too. So that's the whole thing. Let's see, so it transforms the article, we get the shape we want, make sure we have all the images, populate those fields, and that's the content resource. Fields is an arbitrary JSON object, so we have the ID and type and time metadata, and then the fields actually dictate the shape of the specific piece of content. So that gets inserted. Those are the simplest.
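To illustrate the split between the fixed columns (ID, type, time metadata) and the free-form fields object, here is a small sketch of an article transform. The Sanity-side property names and the contents of `fields` are assumptions for illustration only.

```typescript
// Hypothetical shape of an article coming in from Sanity.
type SanityArticle = {
  _id: string
  _createdAt: string // ISO timestamp
  _updatedAt: string // ISO timestamp
  title: string
  slug: string
  body?: string
  image?: string
}

// Hypothetical content resource row: fixed metadata plus arbitrary JSON fields.
type ArticleResource = {
  id: string
  type: 'article'
  createdAt: Date
  updatedAt: Date
  fields: Record<string, unknown>
}

// Map the incoming article onto the content resource shape.
function toArticleResource(article: SanityArticle): ArticleResource {
  return {
    id: article._id,
    type: 'article',
    createdAt: new Date(article._createdAt),
    updatedAt: new Date(article._updatedAt),
    fields: {
      title: article.title,
      slug: article.slug,
      body: article.body ?? '',
      // Only include the image key when the source article has one.
      ...(article.image ? { image: article.image } : {}),
    },
  }
}
```

Everything specific to articles lives inside `fields`, so another content type can reuse the same row shape with a different `type` and a different set of fields.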

[03:42] The tips get a little more interesting. So same kind of thing, but we have a video resource. So bringing those in, that means that we need the transcripts, and sometimes those are in CastingWords in the database. So I used coalesce, which means if this exists, use this, and if it doesn't, use the transcript. So we get those, and it basically does the same thing, but we have to write the video resource schema first, so here's where that gets written. Then we have the tip schema, and then we create that content resource resource where we are associating the video resource with our new tip, and then finally pumping the tip into the database.
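The coalesce fallback and the write order for a tip can be sketched as below. The `coalesce` helper here is a plain TypeScript stand-in for whatever coalesce the script actually uses, and the property names (`castingwordsTranscript`, `transcript`, `videoResourceId`) are hypothetical.

```typescript
// Return the first value that is neither null nor undefined.
function coalesce<T>(...values: Array<T | null | undefined>): T | undefined {
  for (const value of values) {
    if (value !== null && value !== undefined) return value
  }
  return undefined
}

// Hypothetical shape of a tip coming in from Sanity.
type SanityTip = {
  _id: string
  title: string
  videoResourceId: string
  castingwordsTranscript?: string | null
  transcript?: string | null
}

type ResourceRow = { id: string; type: string; fields: Record<string, unknown> }
type JoinRow = { resourceOfId: string; resourceId: string; position: number }

// Build the rows for one tip. In the order described above, the video
// resource is written first, then the association, then the tip itself.
function planTipWrites(tip: SanityTip): {
  videoResource: ResourceRow
  association: JoinRow
  tip: ResourceRow
} {
  const transcript = coalesce(tip.castingwordsTranscript, tip.transcript) ?? null
  return {
    videoResource: { id: tip.videoResourceId, type: 'videoResource', fields: { transcript } },
    association: { resourceOfId: tip._id, resourceId: tip.videoResourceId, position: 0 },
    tip: { id: tip._id, type: 'tip', fields: { title: tip.title } },
  }
}
```

Note that this helper treats an empty string as a present value; only null and undefined trigger the fallback.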

[04:23] Tutorials are a little more complex. This is a module, so this is going to have sections and explainers and lessons and solutions and all those different sorts of fun things. I don't need to write that anymore. I was writing to the file system just to get my JSON out of that so I could check it out. So I parse it just to test it, make sure that works.

[04:46] It's using a module schema which I generated, so this didn't exist in our system. We do have a tutorial schema here, which required a little bit of finagling, and I probably should have just used that, to be honest, but I didn't. So I use the generated one for the stuff coming in, and this works, so I've tested it. It comes in, records the contribution, so this is the instructor in this case. One of the things is, in articles, the contribution I think also gets recorded, but let's see. I think it defaults to author. I should just make that explicit.

[05:26] There we go. All right, yeah, so tutorials comes through and does the same thing, migrates sections, you can see all that. They indent a little bit with a little tab here just to see it. Created by, right now we don't have any blended ones, so created by, and this is not who owns it, this is who actually added it to the database, and I'm just going to attribute it to the user there for now, which is actually my user ID. Finally we write the content resource. If we're creating sections, which we are here, if there's more than one we create them, and if there's not we don't, and that gets dumped into the section or just into the root.

[06:06] We keep track of an index of these lessons because we're gonna want to sort them, so as we cycle through them we're just iterating on that so we can sort by it. And that actually goes into the association when we write, down here, the resource resource position. We're using lesson index and then incrementing that, so on the next loop it'll stack them up. All right, sections is actually the same way when we do that. We'll have to do a section index because those have to be sorted as well. And these are the objects that we see in the UI that actually get sorted.

[06:39] It's the resource resource. Finally the tutorial gets added. Workshops are dead the same as tutorials. There is no real difference. The main difference being a workshop is associated with a product because it's for sale.
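The position bookkeeping described above might be sketched like this. The type and function names are hypothetical, sections are only created when there is more than one, and the choice to restart the lesson index inside each section is an assumption; the script may keep a single running index instead.

```typescript
// Hypothetical shapes for the incoming module structure.
type IncomingLesson = { _id: string }
type IncomingSection = { _id: string; lessons: IncomingLesson[] }
type Association = { resourceOfId: string; resourceId: string; position: number }

// Build the join rows for a module, recording a position on each one.
function buildAssociations(moduleId: string, sections: IncomingSection[]): Association[] {
  const associations: Association[] = []
  // Sections are only created when there is more than one of them;
  // otherwise lessons attach directly to the module root.
  const createSections = sections.length > 1
  let sectionIndex = 0
  let lessonIndex = 0
  for (const section of sections) {
    const parentId = createSections ? section._id : moduleId
    if (createSections) {
      associations.push({ resourceOfId: moduleId, resourceId: section._id, position: sectionIndex })
      sectionIndex++
      lessonIndex = 0 // assumption: lesson positions restart within each section
    }
    for (const lesson of section.lessons) {
      associations.push({ resourceOfId: parentId, resourceId: lesson._id, position: lessonIndex })
      lessonIndex++
    }
  }
  return associations
}
```

Storing the position on the join row rather than on the lesson means reordering in the UI only touches association rows, not the content itself.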

[06:53] So I changed the words but I didn't have to change anything in here. And now if we come over to here, I'll refresh, and we'll see there's all of our workshops and all of the video resources are individually in here and the transcripts are associated with them and everything is moved into a shape that... Oops... Let's see... And these are...

[07:18] This is... This particular deployment is running off of this database, so this is the course builder branch of the database. So this is the migration I just ran. And stuff works. Tutorials did not.

[07:30] I don't know. I'll fix it. But that's the gist of this migration so far.