We are on Learn SaaS.
We have a student management system that is the source of truth of all student/course enrolments, and a middleware layer that can transform this data into essentially the content of the course_users table in Learn.
At the moment, we synchronise this data as follows:
- change detection triggers whenever a student's enrolments are altered
- the student's enrolments in Learn are queried using the REST API (previously, I believe we used SOAP for this)
- two SIS feed files are created, one for added enrolments, one for removed (actually disabled, to prevent deleting student content)
- the feed files are pushed to Learn SIS endpoints
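For anyone following along, the comparison step boils down to a set difference between what the student management system says and what Learn currently has. This is only a minimal sketch: the `Enrolment` type and `diff_enrolments` function are hypothetical names for illustration, not part of our middleware or any Blackboard API.

```python
from typing import NamedTuple


class Enrolment(NamedTuple):
    """One course membership; both fields are illustrative identifiers."""
    course_id: str
    user_id: str


def diff_enrolments(
    desired: set[Enrolment], current: set[Enrolment]
) -> tuple[set[Enrolment], set[Enrolment]]:
    """Return (to_add, to_disable) for one student's enrolments.

    desired: enrolments according to the student management system.
    current: enrolments Learn reports for the same student.
    """
    to_add = desired - current       # in the source of truth, missing in Learn
    to_disable = current - desired   # in Learn, no longer in the source of truth
    return to_add, to_disable


# Hypothetical example data
desired = {Enrolment("MATH101", "s123"), Enrolment("PHYS102", "s123")}
current = {Enrolment("MATH101", "s123"), Enrolment("CHEM100", "s123")}
to_add, to_disable = diff_enrolments(desired, current)
```

Each of the two resulting sets then becomes one of the two feed files.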
This maintains approximately real-time synchronisation with whatever the student signs up for, so we don't have to wait for a daily load. If necessary, we can push a manual reconciliation request to ensure that a particular user or subject full of users is immediately put through this process.
There are several problems with this:
- At certain times of year, there are many changes. We have about 30k detected changes queued at the moment because student services have just opened up class registrations. We only have 14k students, but many of them are enrolling in multiple subjects, and selecting lectures/tutorials within them (which correspond to course groups in Learn).
- Most feed files are only 1 row long. So the ActiveMQ broker has to deal with ~30k tiny files waiting to be processed, which can't be an efficient way of going about things.
- Occasionally SIS breaks - on three occasions this year, the endpoint has accepted feed files with success messages and then the feed files have somehow been lost, leading to a real headache trying to figure out who we need to re-reconcile. The error messages mentioning a 'read-only filesystem' are damning (this happens when a Linux filesystem falls over). It probably happens because so many files are being written/read/erased.
- Because of the numbers involved and the inefficiency of the processes, sometimes the 'real-time' synchronisation is more like several hours out of date. So a REST query may think something needs changing when it's already been queued to be changed.
- Even read-only REST queries typically take 200-400 ms, and there are many queries required to find out, for example, the course groups that a user is a member of. So some of the triggered reconciles can take many seconds to execute. Multiply this by 30k triggered reconciles and it takes days to clear the backlog.
- Because SIS endpoints are asynchronous, we would have to make a subsequent query to find out whether a particular feed file has actually finished running, and whether any errors occurred. We don't do this at the moment, although perhaps we should. In any case, sometimes users get out of sync due to errors, and we don't know that they're out of sync until change detection fires on their data in the student management system.
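To put rough numbers on the backlog, here is a back-of-envelope estimate. The 200-400 ms per query and the 30k reconciles are the figures above; the 10 queries per reconcile is purely an assumed value for illustration, and the calculation assumes reconciles are processed one at a time.

```python
def backlog_hours(
    reconciles: int, queries_per_reconcile: int, seconds_per_query: float
) -> float:
    """Hours needed to clear the queue if reconciles run strictly in sequence."""
    return reconciles * queries_per_reconcile * seconds_per_query / 3600


QUERIES_PER_RECONCILE = 10  # assumption, not a measured figure

low = backlog_hours(30_000, QUERIES_PER_RECONCILE, 0.2)   # fastest query time
high = backlog_hours(30_000, QUERIES_PER_RECONCILE, 0.4)  # slowest query time
print(f"Estimated backlog: {low:.1f} to {high:.1f} hours")
```

Under those assumptions the REST queries alone account for somewhere between most of a day and a day and a half, before any feed file has been processed.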
We could go to REST all the way, though we've been waiting for that API to mature a little - some features like subject copy were only introduced a couple of months ago, and there is no way to make multiple changes or queries with one request (the 'batch' endpoint in the non-public REST API that Ultra uses would be useful). I've been considering writing a B2 to add a few such REST endpoints ourselves, but it would be a bit ugly - I don't believe there's any way of extending the existing REST infrastructure, so we'd have to bundle our own REST engine/authentication.
We have DDA, which is almost-realtime read-only access to the Learn databases, and which would dramatically simplify the process (we wouldn't need to do any REST queries at all, and we could batch together all pending enrolment changes). However, Blackboard explicitly says not to rely on it for production-critical processes. Again, I'm tempted to write a B2 which simply exposes a read-only SQL endpoint, but perhaps that'd be against the rules? I haven't investigated this.
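As a rough sketch of what I mean by batching: collapse all pending changes so that only the latest state per course/user pair survives, then write a single pipe-delimited feed instead of thousands of one-row files. The `Change` type and `build_feed` function are hypothetical, and the header names, role value and row status values are assumptions that would need checking against the SIS framework documentation. It also merges adds and disables into one file via a row status column, which differs from our current two-file approach.

```python
import csv
import io
from typing import Iterable, NamedTuple


class Change(NamedTuple):
    """One pending enrolment change (illustrative fields only)."""
    course_key: str
    person_key: str
    enabled: bool  # True = add/enable, False = disable


def build_feed(changes: Iterable[Change]) -> str:
    """Collapse pending changes and render them as one pipe-delimited feed."""
    # A later change for the same (course, person) pair overrides an earlier one.
    latest: dict[tuple[str, str], bool] = {}
    for change in changes:
        latest[(change.course_key, change.person_key)] = change.enabled

    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    # Header names are an assumption; verify against the feed file spec.
    writer.writerow(
        ["EXTERNAL_COURSE_KEY", "EXTERNAL_PERSON_KEY", "ROLE", "ROW_STATUS"]
    )
    for (course_key, person_key), enabled in latest.items():
        status = "enabled" if enabled else "disabled"
        writer.writerow([course_key, person_key, "Student", status])
    return out.getvalue()


# Hypothetical queue: the student enrols in MATH101, then withdraws again.
pending = [
    Change("MATH101", "s123", True),
    Change("MATH101", "s123", False),
    Change("PHYS102", "s456", True),
]
feed = build_feed(pending)
```

Three queued changes become a two-row feed here, and the same collapsing would apply to a queue of 30k.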
It's a very cumbersome and fragile process that we've been fighting for a while now, so I'm wondering what other institutions do. Does anyone have any advice?