The talk outlines an end-to-end pipeline: the StreamSets UDP listener collects packets from the F1 2018 game and writes them to Kafka, a Groovy step reads from Kafka and parses the packets, and the OmniSci JDBC driver inserts the data into one of nine OmniSciDB tables. With this workflow, you have a robust platform for accelerated analytics, using the power of GPUs for fast computation.
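To make the parse-and-route step concrete, here is a minimal Python sketch (the talk itself does this in Groovy inside StreamSets). It assumes the 21-byte little-endian packet header as I recall it from the published F1 2018 UDP format, so verify the layout against the official spec before relying on it. The table names and the `route_packet` helper are hypothetical and are not the tables used in the talk.

```python
import struct

# Assumed F1 2018 header layout (little-endian, no padding):
# packet format (uint16), packet version (uint8), packet id (uint8),
# session UID (uint64), session time (float32), frame id (uint32),
# player car index (uint8). Verify against the official UDP spec.
HEADER = struct.Struct("<HBBQfIB")

# Hypothetical destination tables keyed by packet id (illustrative only).
TABLE_BY_PACKET_ID = {
    0: "f1_motion",
    1: "f1_session",
    2: "f1_lap_data",
    3: "f1_event",
    4: "f1_participants",
    5: "f1_car_setups",
    6: "f1_car_telemetry",
    7: "f1_car_status",
}


def route_packet(payload: bytes):
    """Parse the header of one raw packet and pick a destination table."""
    if len(payload) < HEADER.size:
        raise ValueError("payload shorter than the assumed header")
    fmt, version, packet_id, session_uid, session_time, frame_id, car_idx = (
        HEADER.unpack_from(payload)
    )
    table = TABLE_BY_PACKET_ID.get(packet_id)
    if table is None:
        raise ValueError(f"unknown packet id {packet_id}")
    header = {
        "packet_format": fmt,
        "packet_version": version,
        "packet_id": packet_id,
        "session_uid": session_uid,
        "session_time": session_time,
        "frame_id": frame_id,
        "player_car_index": car_idx,
    }
    return table, header


# Usage: a synthetic packet with packet id 6 and ten bytes of dummy body.
sample = HEADER.pack(2018, 1, 6, 123, 1.5, 42, 0) + b"\x00" * 10
table, header = route_packet(sample)
```

In the real pipeline, the bytes would come from a Kafka consumer rather than a synthetic payload, and the parsed body fields (not just the header) would be inserted into the chosen table.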
In this webinar sponsored by the Open Data Science Conference (ODSC), I outline a brief history of GPU analytics and the problems it solves relative to other parallel computation methods such as Hadoop. I also demonstrate how OmniSci fits into the broader GPU-accelerated data science workflow, with examples provided using Python.
Check out the video, grab the Jupyter Notebook from the odscwebinar repo and get started with OmniSci and GPU-accelerated data science!
This talk is from October 2018, and the GOAI/RAPIDS ecosystem has changed so much since then that it's almost comical to watch now! Regardless, the high-level concepts of how OmniSci works and the concepts behind GPU dataframes (then: pygdf, now: cudf) remain the same, so watching this talk still has value if you are interested in an end-to-end GPU workflow.