Nearly every electronic gizmo on the planet and beyond creates data about us — what we do, and where, when and how we do it. And sometimes why we do it, too.
It’s a massive, unimaginable amount of data.
In a first-of-its-kind partnership with University of the Pacific’s Master of Science in Data Science program, the San Joaquin Council of Governments (SJCOG) is trying to better understand a sliver of that data gathered from EZHub, its one-stop-shop mobility payment application, which is available by downloading the Vamos transit trip planner. The app generates a large amount of data as it allows users to plan trips via Vamos and buy tickets using EZHub across routes serviced by the San Joaquin Regional Transit District, Altamont Corridor Express, and transit providers in Escalon, Ripon, Manteca, Tracy and Lodi. The data provides a glimpse into the behavior of San Joaquin County transit users and can give SJCOG and its partner jurisdictions the information needed to improve or expand services.
Four Pacific students, Dat Tat Mai, Kunlin Lyu, Yu-Wen Chen and Tan Zheng, analyzed the data for their capstone project, applying what they learned in the graduate program. Since the app was pulling data from various sources, the team created a single data source to streamline app operations and workflow. The Pacific team also suggested what future data science students can do to expand on the project.
“The students did a great job and we are impressed with all the ideas they had,” SJCOG Executive Director Diane Nguyen said. “We’re going to look very closely at their recommendations to develop improvements for EZHub and get more people to use this technology to purchase tickets, especially now that more and more transit riders are getting back on buses and trains since the pandemic is winding down. We’re very happy with the partnership we have with Pacific’s data science program.”
Students looked at limited data from 2020 and 2021 and were able to glean additional data — the times and locations the app was used, U.S. holidays and weather data — to better focus on patterns and trends.
Some of the insights and recommendations include:
- Despite COVID-19, use of the app increased by 180 percent from 2020 to 2021 and the overall number of active users increased throughout the period.
- The mobile app was used most frequently from 1 p.m. to midnight, especially on Tuesdays and Wednesdays, perhaps because users were arranging for trips the following days. The students suggested advertising during this time to reach more customers.
- Most users were planning their trips at least 36 hours in advance.
- The team recommended building customer profiles with more details for future customer analytics.
- They also recommended separating travel mode and travel time for a more accurate picture of a user’s trip. They made several recommendations on gathering more information on the purchase of tickets.
- And they recommended going with a multiple-choice and scaling survey rather than asking users to text responses.
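As a rough illustration of the kind of tallying behind figures like these, the short Python sketch below counts app events by year, weekday and hour, then computes a percent increase. The timestamps are made up for illustration and are not EZHub data, the names `EVENTS`, `pct_increase` and `summarize` are hypothetical, and this is not a description of the students' actual method.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Made-up usage timestamps for illustration only; not EZHub data.
EVENTS = [
    "2020-03-03T14:05:00",
    "2020-11-18T20:30:00",
    "2021-02-02T13:15:00",
    "2021-02-03T21:45:00",
    "2021-06-08T09:10:00",
    "2021-06-09T23:00:00",
    "2021-09-14T15:20:00",
]


def pct_increase(old, new):
    """Percent change from old to new (100 -> 280 is a 180 percent increase)."""
    return (new - old) / old * 100.0


def summarize(timestamps):
    """Count events by calendar year, weekday name and hour of day."""
    parsed = [datetime.fromisoformat(t) for t in timestamps]
    by_year = Counter(d.year for d in parsed)
    by_weekday = Counter(WEEKDAYS[d.weekday()] for d in parsed)
    by_hour = Counter(d.hour for d in parsed)
    return by_year, by_weekday, by_hour


by_year, by_weekday, by_hour = summarize(EVENTS)
print("Events per year:", dict(by_year))
print("Percent increase:", pct_increase(by_year[2020], by_year[2021]))
print("Busiest weekdays:", by_weekday.most_common(2))
# Events between 1 p.m. and midnight (hours 13 through 23).
print("Afternoon/evening events:",
      sum(n for hour, n in by_hour.items() if hour >= 13))
```

Note that a 180 percent increase means the later count is 2.8 times the earlier one, not 1.8 times.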
By combining their classroom work with the EZHub project, the students gained hands-on experience that should help them thrive in their future careers.
“It’s crucially important for our students to work with real-world problems,” Program Director James Hetrick said. “We have many example datasets that we use in class, but it’s essential to see the problems that different organizations are wrestling with in order to see the similarities and the different business problems related to data.”
Pacific student Lyu agreed.
“The real-world data is way more complicated than the sample data set that we used in class,” Lyu said. “It is really important to have such experience that helps us understand the industry better before we get into it. I also learned that the collected data sometimes doesn’t meet your expectation and it might take you some time to collect more data before you can actually perform analysis.”
Hetrick said the program works with various partners such as Schwab, Gallo Wines, nonprofit Central Valley Low Income Housing Corp., SMUD, the Sacramento Kings and others to give students broad experiential learning on how organizations use data to improve.
“By seeing different projects across such a rich spectrum of partners, my students are becoming data superheroes,” Hetrick said. “They are ready for anything.”
Looking at real-world data such as that gathered from EZHub can help them gain a foothold on their futures.
“High test scores and impressive GPAs can make me look good in front of graduate schools but real-world experience makes me stand out from other job applicants,” said Mai. “Also, performing well in a real-world data science team requires many skills, which cannot be taught in the majority of master’s degree courses.”
“It was the first time we got chance to manipulate data from real world and apply our skills to help improve an organization’s workflow,” Zheng said. “The feeling of achievement should help me to build my personal confidence for my career.”
The students believe future data science students can take their work and expand on the analysis. Machine learning, a branch of artificial intelligence, automates analytical model building and is based on the idea that such systems can learn from data and patterns to make decisions with little human intervention.
“Because the use of EZHub is expected to grow as the COVID-19 pandemic shifts, this brings new analysis possibilities for future data science students working on this project,” Chen said. “As a result, I’m hoping they can undertake more machine learning analysis to dig deeper into the data and provide more insights to help SJCOG to improve the EZHub mobile ticketing.”
There are plans for future Pacific Master of Science in Data Science program students to continue the work.
The program is located on Pacific’s San Francisco campus in the SOMA district at 155 Fifth St. It uses a hybrid approach that combines online and in-person classes, and it requires 32 units completed in four semesters over two academic years.
View the EZHub "How To" video in English or Spanish.