Everyday analytics at a start-up: Power BI for Prism @ Nuonic
I wrote this post back when I worked as a data analyst and then product owner at a small start-up called Nuonic.
Who is Nuonic?
Nuonic is an analytics startup that is part of the InLoop group of companies, which also includes Lantern Pay and Flexischools. Nuonic was started as an internal analytics service for the group, to generate new value from the data sets being captured through those transactional businesses. Nuonic has since also developed an innovative solution to help commercial vehicle operators improve their Fuel Tax Credit (FTC) processing using GPS data from vehicle tracking systems.
What is Prism?
Prism is a unique and powerful data analytics engine built for analysing GPS data from vehicle fleets. What sets Prism apart is its ability to analyse large amounts of vehicle activity captured by telematics units using advanced analysis methods which would be very difficult to do without specialist skills & tools. This capability is especially useful for FTC calculations, enabling small fleets to use approaches previously only accessible to large ones while adding efficiency and transparency across the board.
Prism has processed over 2 billion (and counting!) GPS data points from nearly 35K vehicles, across fleets of all sizes and types. It is a fully automated solution that connects with many telematics systems to source GPS data. So far Prism has partnerships with 10 telematics systems that cover more than 50% of the installed telematics user base in Australia, with more being added all the time.
How I came to work at Nuonic
I have known Seb (CEO and co-founder) from the analytics community in Brisbane for years now. When I decided that I wanted to move on from Tabcorp, Seb was one of the people I reached out to. That happened to be exactly the time Nuonic were looking to bring someone on board in a product development and analytics role.
Since starting at Nuonic I have been working on Prism, helping build out features and the customer base. We’re only a small team but we have big dreams so there’s been a lot of work to get through. Fairly early on I realised I needed to bring Power BI into Nuonic to deal with my quickly growing workload.
Power BI as an operational tool at a start-up
I like to think I use Power BI as an extension of myself. When I first started at Nuonic we only had a handful of subscription clients and only a few retrospective (ad hoc) reporting jobs each month. As a result I did manual checks on everything simply by running database queries. It was a great way to get started as it forced me straight into our database tables and I got a deep understanding of the data flows in Prism.
As we started to bring on more subscription clients and our retrospective pipeline started to ramp up, I saw that I wouldn’t be able to keep pace with the manual checks I was doing. I didn’t have the option to only do spot checks either: we need to examine each report to help guide the development of our processing algorithms. Because Prism consumes data from multiple telematics systems and standardises it, we regularly face complex technical challenges around interpreting the nuances of each GPS tracking device type/service. This requires us to have a lot of visibility into what is going on inside our systems. I had used Power BI a lot at my previous job and, seeing as InLoop had an enterprise Office 365 account, it was to Power BI that I turned once again.
Reporting QA
Our processing algorithms are evolving constantly, not least because of the complexity added by new telematics partners coming on board regularly. We are also regularly adding features to our maps dataset. Seeing as there are many things that can influence reported results, it makes sense to check every report. When we only had a handful of reports each week this was fine to do with direct queries and Excel, but as soon as the volume increased we needed something to speed this process up. Ultimately, the plan is to implement bespoke monitoring tools built into the platform, but in the interim we needed a tool that is both fast and flexible and could work with the different types of data going into and out of the system.
Sense checking the shape of our data
The most basic report check we do is plotting the number of trips we’ve processed against the number of unique vehicle IDs. By looking at the trend over a few months you get a good idea of what normal is for each client. It’s really easy and quick to pick up gaps in the data as well as unusual usage patterns, while still being able to check lots of individual reports quickly.
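As an illustration of the idea (the data model here is a hypothetical sketch, not Prism’s real schema), the shape check boils down to counting trips and unique vehicles per month for a client, then flagging any months with no data at all:

```python
from collections import defaultdict
from datetime import date

# Hypothetical trip records: (client_id, vehicle_id, trip_date).
trips = [
    ("acme", "V1", date(2019, 1, 5)),
    ("acme", "V2", date(2019, 1, 9)),
    ("acme", "V1", date(2019, 3, 2)),  # note: nothing in February
]

def monthly_shape(trips, client_id):
    """Trip count and unique-vehicle count per month for one client."""
    trip_counts = defaultdict(int)
    vehicles = defaultdict(set)
    for client, vehicle, day in trips:
        if client == client_id:
            month = (day.year, day.month)
            trip_counts[month] += 1
            vehicles[month].add(vehicle)
    return {m: (trip_counts[m], len(vehicles[m])) for m in trip_counts}

def find_gaps(shape):
    """Months with no data between the first and last observed month."""
    months = sorted(shape)
    gaps, (y, m) = [], months[0]
    while (y, m) < months[-1]:
        m += 1
        if m > 12:
            y, m = y + 1, 1
        if (y, m) < months[-1] and (y, m) not in shape:
            gaps.append((y, m))
    return gaps
```

In Power BI the same thing falls out of a couple of visuals over the trips table; the point is that a gap like the missing February jumps straight out of the trend line.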
We also look at the trend of on road and off road usage over each month to see if there’s any unusual behaviour. If we need to we can also dig into individual vehicle breakdowns.
Looking for pass/fail
Each monthly reporting period we have to email an ever-increasing number of reports. When we were only sending a few reports we could just eyeball the email logs out of AWS, but with the volume we have now that is no longer practical. Now we graph the email logs against our processed data. We want to answer: IF a client is due a report AND they have a complete month’s worth of data, HAVE they been successfully sent one? The number one thing we’re looking for is clients who should have been emailed a report but haven’t been.
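The pass/fail logic is simple enough to sketch in a few lines (the field names and status labels here are illustrative, not our actual schema):

```python
def report_status(client, emailed):
    """Classify one client's delivery state for the reporting period.

    client: dict with 'id', 'report_due' and 'month_complete' flags.
    emailed: set of client IDs with a successful send in the email logs.
    """
    if not client["report_due"]:
        return "not due"
    if not client["month_complete"]:
        return "waiting on data"
    return "sent" if client["id"] in emailed else "MISSING"

clients = [
    {"id": "c1", "report_due": True, "month_complete": True},
    {"id": "c2", "report_due": True, "month_complete": True},
    {"id": "c3", "report_due": True, "month_complete": False},
    {"id": "c4", "report_due": False, "month_complete": True},
]
emailed = {"c1"}
statuses = {c["id"]: report_status(c, emailed) for c in clients}
# The "MISSING" entries are the ones to chase up.
```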
Looking for known issues
We have a couple of known issues we need to monitor for.
It’s possible for duplicate IDs to appear in our data from either the data feeds we are consuming or from our users editing their vehicle telematics IDs (only an issue on some provider platforms). For this we graph a query that looks for duplicate telematics vehicle IDs by Prism client.
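The underlying duplicate check is essentially a group-and-count; a minimal sketch, assuming each registered vehicle yields one (client, telematics ID) pair:

```python
from collections import Counter

def duplicate_ids(records):
    """records: (client_id, telematics_vehicle_id) pairs.

    Returns {client_id: sorted duplicated IDs}; an empty dict is healthy.
    """
    counts = Counter(records)
    dupes = {}
    for (client, vehicle_id), n in counts.items():
        if n > 1:
            dupes.setdefault(client, []).append(vehicle_id)
    return {c: sorted(ids) for c, ids in dupes.items()}

records = [("acme", "T1"), ("acme", "T2"), ("acme", "T1"), ("beta", "T9")]
```

Graphed per client, any non-zero bar is a prompt to go and fix the offending IDs.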
We sometimes see vehicle IDs appearing in the GPS data feed that aren’t in the vehicle information feed. For many telematics providers the vehicle information (much like Prism’s vehicle information) is updated by a separate process from the GPS tracking data and so can be slow to update. Because we need to capture all the movement data for each month’s FTC reports it’s important that we identify and add these missing vehicles quickly. To track this we graph a query comparing the unique list of telematics vehicle IDs in the vehicle register to the same list in the GPS data, by Prism client.
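That comparison is just a set difference between the two ID lists; sketched here with hypothetical IDs:

```python
def missing_from_register(register_ids, gps_ids):
    """Telematics vehicle IDs seen in the GPS feed but absent from the
    vehicle register -- vehicles to add before month-end reporting."""
    return sorted(set(gps_ids) - set(register_ids))

register = {"T1", "T2", "T3"}
gps_feed = {"T1", "T2", "T3", "T4"}  # T4 is tracking but not registered
```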
In both cases we graph everything so it’s easy to see when there’s action that needs to be taken. Best case is to see zero data in both dashboards.
Looking for unknown issues
As the old saying goes, “you don’t know what you don’t know” and this applies to our systems as much as any other system containing many moving parts.
That doesn’t mean we are completely blind though. One way we diagnose unknown issues is from the various “heartbeats” our system gives off. Three heartbeats we measure are:
- Processing job statistics
- Excluded GPS data
- Cleaned data points
Using the processing job logs out of AWS we graph the processing job statistics per client. We’re looking firstly for failed or hanging jobs and secondly for jobs that completed really quickly. Jobs that completed really fast can suggest that we received a file that was either empty or nearly empty. Because of the range of vehicle fleets we deal with it’s quite possible that a small fleet will see nearly all their jobs complete very fast. However that shouldn’t be the case for large fleets and can alert us to areas to investigate further.
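A rough sketch of that triage logic (the thresholds, statuses and field names are illustrative guesses; the real statistics come out of our AWS job logs):

```python
def flag_jobs(jobs, fast_seconds=5, large_fleet=50):
    """Flag failed/hanging jobs always; flag very fast jobs only for
    large fleets, where a near-empty input file would be suspicious.

    jobs: dicts with 'client', 'status', 'duration_s', 'fleet_size'.
    """
    flags = []
    for job in jobs:
        if job["status"] in ("failed", "hanging"):
            flags.append((job["client"], "job did not complete"))
        elif job["duration_s"] < fast_seconds and job["fleet_size"] > large_fleet:
            flags.append((job["client"], "completed suspiciously fast"))
    return flags

jobs = [
    {"client": "acme", "status": "failed", "duration_s": 0, "fleet_size": 120},
    {"client": "beta", "status": "done", "duration_s": 2, "fleet_size": 300},
    {"client": "tiny", "status": "done", "duration_s": 2, "fleet_size": 5},
]
```

Note the small fleet isn’t flagged even though its job finished fast, which matches how we read the graphs by eye.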
Another heartbeat we track is excluded GPS data. GPS data is no different than any other data in that it often needs cleaning.
We need to clean out GPS points that exceed what would be considered reasonable bounds; e.g. it’s unlikely a vehicle will be travelling at 674km/h on a highway in Australia, or take a detour through the Pacific Ocean to get from Sydney to Melbourne. Though there would be less traffic…
Because any given trip has a lot of GPS points, excluding a few is generally immaterial to the overall result (which is comprised of hundreds of thousands or millions of points). However, if we start to see a lot of points cleaned out of a client’s vehicle movement data, that can indicate an issue with either our algorithms, the GPS data source, or both. Part of our system development task is continuously learning about the causes of these issues and designing solutions for them.
We perform (and track) two cleaning passes: one on the initial data we receive, point by point, and another after we construct trips, to remove any trips that fail the reasonable-bounds tests we have.
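The two passes can be sketched as follows (the speed ceilings and the rough Australia bounding box are illustrative assumptions, not our production thresholds):

```python
MAX_SPEED_KMH = 200  # assumed generous ceiling for a road vehicle

def clean_points(points):
    """Pass 1: drop individual GPS points outside reasonable bounds.

    points: dicts with 'speed_kmh', 'lat', 'lon'; the lat/lon test is a
    crude bounding box around mainland Australia for illustration.
    """
    kept, excluded = [], []
    for p in points:
        plausible = (-44 <= p["lat"] <= -10
                     and 112 <= p["lon"] <= 154
                     and 0 <= p["speed_kmh"] <= MAX_SPEED_KMH)
        (kept if plausible else excluded).append(p)
    return kept, excluded

def clean_trips(trips, max_avg_speed_kmh=150):
    """Pass 2: drop whole trips whose implied average speed is implausible."""
    kept, excluded = [], []
    for t in trips:
        avg = t["distance_km"] / max(t["hours"], 1e-9)
        (kept if avg <= max_avg_speed_kmh else excluded).append(t)
    return kept, excluded
```

Tracking the size of each `excluded` list per client per month is exactly the heartbeat we graph.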
For these heartbeats we don’t have a hard line to say that something is right or wrong. Rather, we use them as a way to keep an eye on how our systems are performing.
Creating a digital assistant
We are a small team and we all wear many hats. One of the areas I look after is reporting. Looking after the reporting means I need to be tracking the status of dozens of live jobs. I’m on-boarding new clients as well as delivering jobs constantly, and it can get fairly confusing very quickly if I’m not organised.
A few months after starting at Nuonic I found myself spending nearly all day every day trying to track everything I had in flight. That left no time for product development and design (another key part of my job) so I enlisted Power BI to free up some of my time.
Trello is how I track the status of each job. I added client IDs in the card titles and I added the date ranges we needed to deliver on in the descriptions. Client IDs enable me to link Trello data to our other datasets and the date range enables me to assess whether a job is ready to send.
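As a sketch of how that linkage works (the card conventions below are illustrative, not the exact format of our board), the client ID comes off the title and the date range off the description:

```python
import re
from datetime import date

# Assumed conventions: client ID at the start of the card title, and a
# "YYYY-MM-DD to YYYY-MM-DD" range somewhere in the description.
def parse_card(card):
    client = re.match(r"([A-Za-z0-9-]+):", card["title"]).group(1)
    rng = re.search(r"(\d{4}-\d{2}-\d{2}) to (\d{4}-\d{2}-\d{2})", card["desc"])
    start, end = (date.fromisoformat(s) for s in rng.groups())
    return {"client": client, "start": start, "end": end}

def ready_to_send(job, data_complete_through):
    """A job is deliverable once processed data covers its full range."""
    return data_complete_through >= job["end"]

card = {"title": "ACME-01: January FTC report",
        "desc": "Deliver 2019-01-01 to 2019-01-31."}
```

With the client ID extracted, the Trello export joins straight onto our processing statistics in Power BI.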
FYI: Trello templates make it even easier to manage our board now!
For telematics systems we don’t have an API integration for, we’ve set up an AWS S3-based upload portal which connects directly to our processing systems. Our partners and customers are able to use this to deliver data to us. Getting eyes on the status of the upload portal (which files had been uploaded, when, what was connected to Prism and what was missing at any point in time) was important to make this work, so our development team provide event statistics from AWS which I can analyse in Power BI.
I combine Prism processing statistics, Trello data and the upload portal statistics to form a simple yet powerful tool that tells me when and what I need to action for reporting, e.g.
- A client has had data processed and is ready to send a report
- A new bucket has been created but isn't connected to any client accounts yet
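A minimal sketch of how those sources combine into an action list (the bucket names, mappings and wording are all hypothetical):

```python
def reporting_actions(bucket_events, bucket_to_client, clients_ready):
    """Combine upload-portal events, account links and processing status
    into a single reporting to-do list.

    bucket_events: buckets seen in the AWS event statistics.
    bucket_to_client: which bucket belongs to which client account.
    clients_ready: clients whose processed data is ready to report on.
    """
    actions = []
    for bucket in sorted(bucket_events):
        if bucket not in bucket_to_client:
            actions.append(f"connect bucket '{bucket}' to a client account")
    for client in sorted(clients_ready):
        actions.append(f"send report for {client}")
    return actions

actions = reporting_actions(
    bucket_events={"uploads-acme", "uploads-new"},
    bucket_to_client={"uploads-acme": "acme"},
    clients_ready={"acme"},
)
```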
This has saved me buckets (pun-tastic!) of time each week. Without it I would probably be full time delivering reporting with no time for product or market development. In a startup where product & market development is the lifeblood of our business this has been a goldmine for our team.
Power BI for marketing
Internal Marketing
After I’d built our reporting analysis tools in Power BI I turned my attention to monitoring the health of our business over time. We still don’t have a historical record that is easy to extract from our systems; instead, I’ve created one using a combination of query snapshots, Excel worksheets and Power BI workbooks.
Every week I post up a company update on the Nuonic Slack showing:
- Change in all time GPS points and vehicles processed
- Change in Subscription clients and vehicles
- Change in client vehicle months across all our categories (Subscription, Retrospective, Sample and Test)
We all have a general feel for the numbers but quantifying them each week has given everyone the same insights and helped us see clearly the impacts of our work. It’s meant we are able to communicate our progress much more easily with the rest of the InLoop group too. Both Lantern Pay and Flexischools are much larger teams with much more mature reporting, so it’s been great to share the same level of insight into our operations with the rest of the group as they do.
External Marketing
It’s great to have a set of go-to statistics to tell the story of your business. After being asked twice for a set of statistics like these for Prism I decided we needed an easily updated source of information we could share with potential clients and partners. The below Power BI workbook table is only a simple set of numbers but it tells the Prism story really well. We’ve crossed 2 billion GPS data points processed and we're well on the way to 3!