By Richard Goeke, Ph.D., associate professor of business at
The NAC&U research project into the cost of higher education was invaluable for the students in my Business Analytics course, because it let them work on a problem they cared about while exposing them to the realities of real-world data and research. Business Analytics applies business programming and statistics to Big Data in a structured manner, with the goal of producing insightful results that address a problem. The reality of Analytics, however, often differs from the textbook definition, because of the problems that inevitably beset a research project. Many problems hit us during this NAC&U project, but the students learned that hard work and perseverance can overcome most setbacks.
One of the first problems we encountered was the data itself: the Delta Cost Project dataset contains 900+ separate fields covering the financial and operating dimensions of nearly every post-secondary institution in the United States. The dataset is therefore massive, so large that it couldn't be uploaded to our class's data servers. Fortunately, we were able to write programs that separated the private schools into their own dataset, which was small enough to upload to our servers.
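As an illustration only (this is not the program the class wrote), a splitting step like the one described above can be sketched in a few lines of Python. The column name `control` and the codes treated as "private" are assumptions made for this example; the real field names and codes would have to be taken from the dataset's manuals. The sample rows are made up.

```python
import csv
import io

# Hypothetical column name and codes, assumed for this sketch only.
SECTOR_COLUMN = "control"
PRIVATE_CODES = {"2", "3"}


def split_private(source, destination):
    """Copy only the private-school rows from source to destination."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(destination, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    # Rows are read one at a time, so the full file never has to fit in memory.
    for row in reader:
        if row[SECTOR_COLUMN].strip() in PRIVATE_CODES:
            writer.writerow(row)
            kept += 1
    return kept


# Tiny made-up sample standing in for the real file.
sample = io.StringIO(
    "unitid,control,tuition\n"
    "1,1,9000\n"
    "2,2,31000\n"
    "3,3,15000\n"
)
output = io.StringIO()
kept_rows = split_private(sample, output)
```

With real files, `sample` and `output` would be replaced by file objects opened with `open(..., newline="")`. Because the rows are streamed rather than loaded all at once, the same approach works on a file too large to handle in one piece.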
With the data now on our servers, the students then learned how difficult real-world research can be. Simply understanding what data was available in the Delta Cost Project dataset required a deep read of the manuals that accompanied the data, and these manuals were both lengthy and complex. Once students isolated the fields they thought would address their chosen research questions, they then had to develop the programs to read in that data. In a couple of cases, data described in the Delta manuals was in fact not provided, so the students learned how to deal with missing and inconsistent data.
Once the students got their data input programs working, new problems arose. Syntax and logic errors are inevitable, but the students pushed past them and went further, writing the SAS code to test for data normality, produce correlation matrices, and run simple regressions. In addition, the students used SAS to produce graphics that meaningfully conveyed the relationships they were measuring. Once the programming was done, the students then needed to understand and interpret their results, so that they could prepare reports and presentations suitable for a business audience. Their results were compelling: each of the independent variables they thought would affect tuition in fact did.
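For readers unfamiliar with these statistics, the two simplest ones named above, a correlation coefficient and a simple (one-predictor) regression, can be sketched as follows. The students worked in SAS; Python is used here only for illustration, the function names are hypothetical, and the numbers are made up and are not results from the project. Normality testing is not covered by this sketch.

```python
from math import sqrt


def _sums(xs, ys):
    """Return the means and the centered sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def simple_regression(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values: a predictor and an outcome that rise together exactly.
predictor = [1, 2, 3, 4, 5]
outcome = [3, 5, 7, 9, 11]

r = pearson_r(predictor, outcome)
slope, intercept = simple_regression(predictor, outcome)
```

A correlation matrix is this same coefficient computed for every pair of variables under study.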
Needless to say, the project was a lot of work for the students and for me, but there is no doubt that the students learned a lot about real-world Analytics, and I learned how well students can perform. In the future, I hope to continue working with this dataset, and I plan on pushing the students further (e.g., multiple regression and clustering). This project worked very well as a platform on which students could apply Business Analytics principles to a real-world problem and, with good old-fashioned hard work, produce results that were valid and insightful.