mb23565

Creating A Python CLI to convert CSV Files to XLSX

Blog Post created by mb23565 on Aug 7, 2017

Hello, I recently just had to run a report via Open DB and supply some Learn Information to a vendor that will be creating a PoC. I was tasked on getting this information but the information of course would violate FERPA. So I had to come up with a way to mask the user information to still supply an accurate model but protect the user information at the same time. So I decided to use an old python script that converted CSV files to XLSX and make some modifications.


 

I used openpyxl to create the Excel XLSX files, Docopt to make the CLI creation easy, and some built in core modules: csv, json, and random.


 

The main challenge was that I had two reports from Open Db that correlated between Activity and Gradebook Data by the USER_ID. I ended up doing some Python wizardry and use a global (gasp!! what globals in Python, your fired!) Anyways, yes the use of a global in Python is frowned upon, but hey it got the job done! That’s what re-factoring is for...right?


 

Anyways, I thought I would share this application and share my work and hopefully see how useful this can be. I plan on implementing a way to connect directly to Open Db (Already have it done in another project) and then add in some common functionality: Mongo DB Storage, Better Configuration Methods, etc.


 

You are all welcome to use, share, clone, hack, and destroy this code here:

https://github.com/elmiguel/CSV2XLSX


 

You will just need to query and save your own data sets… Luckily for you, I will share my queries:

 

Activity Accumulator:


 

SELECT AA.PK1
,AA.EVENT_TYPE
,U.USER_ID
,CM.BATCH_UID
,AA.GROUP_PK1
,AA.FORUM_PK1
,AA.INTERNAL_HANDLE
,AA.CONTENT_PK1
,AA.DATA
,AA.TIMESTAMP
,AA.STATUS
,AA.MESSAGES
,AA.SESSION_ID
FROM BBLEARN.ACTIVITY_ACCUMULATOR AA
INNER JOIN BBLEARN.USERS U
ON AA.USER_PK1 = U.PK1
INNER JOIN BBLEARN.COURSE_MAIN CM
ON AA.COURSE_PK1 = CM.PK1
WHERE CM.COURSE_ID IN (‘LIST COURSE_IDS HERE’
)
ORDER BY AA.TIMESTAMP DESC;







 

 

 
  Gradebook Data:


 

SELECT
CM.COURSE_ID,
U.USER_ID,
U.FIRSTNAME,
U.LASTNAME,
GM.TITLE,
GM.DUE_DATE,
A.SCORE,
CASE A.STATUS
WHEN 1 THEN 'NOT_ATTEMPTED'  
WHEN 2 THEN 'ABANDONED'
WHEN 3 THEN 'IN_PROGRESS'  
WHEN 4 THEN 'SUSPENDED'
WHEN 5 THEN 'CANCELLED '
WHEN 6 THEN 'NEEDS_GRADING'  
WHEN 7 THEN 'COMPLETED'
WHEN 8 THEN 'IN_MORE_PROGRESS'  
WHEN 9 THEN 'NEEDS_MORE_GRADING'
ELSE 'NO STATUS'
END AS STATUS,
A.ATTEMPT_DATE,
A.FIRST_GRADED_DATE,
A.LAST_GRADED_DATE,
A.DATE_ADDED,
A.DATE_MODIFIED,
A.LATEST_IND
-- A.STUDENT_SUBMISSION
-- A.PK1 ATTEMPT_PK1,
-- A.QTI_RESULT_DATA_PK1,
-- GG.GRADEBOOK_MAIN_PK1,
-- GG.PK1 GG_PK1
FROM GRADEBOOK_MAIN GM  
INNER JOIN GRADEBOOK_GRADE GG
ON GG.GRADEBOOK_MAIN_PK1 = GM.PK1
INNER JOIN ATTEMPT A
ON A.GRADEBOOK_GRADE_PK1 = GG.PK1
INNER JOIN COURSE_USERS CU
ON GG.COURSE_USERS_PK1 = CU.PK1
INNER JOIN COURSE_MAIN CM
ON CU.CRSMAIN_PK1 = CM.PK1
INNER JOIN USERS U
ON CU.USERS_PK1 = U.PK1
WHERE CM.COURSE_ID IN (‘LIST COURSE_IDS HERE’);







Outcomes