Sometimes I travel; sometimes I draw; sometimes I write
Most times I code
Data Scientist / Engineer
Who I am
I'm a data engineer turned data scientist. My professional experience ranges across data engineering, microservices, search engines, natural language processing, and artificial intelligence. My weapon of choice is Python.
Built test cases for a web vulnerability scanner using C# .NET
Supported an internal procurement web application written in Java
Pennsylvania State University, University Park
BSc, Information Science and Technology, Design and Development
Please don't hesitate to reach out if you have any enquiries!
Predictive analytics for social marketing content
I was part of a fantastic startup, working directly with founders David Dundas and Ryan Witt. During my time there, we went through a product pivot, from a self-serve mobile advertising platform to a social content analytics product. The two products presented different challenges and allowed me to grow in both breadth and depth.
For the advertising platform, I developed a data pipeline that consumed more than 500M events daily. It processed them in near real time, with less than 1 hour from ingestion to availability through the RESTful API. The endpoints were further designed for efficient reporting and data analysis.
During the pivot to social analytics, I delivered an analytics dashboard MVP within two weeks. The MVP included an extract-transform-load pipeline for numerous social platforms. The collected social content was profiled into various parameters for the data science team to model.
Social analytics for Chinese language and market
Bootstrapped SaaS Startup
For my previous startup, I built a social analytics product with a bespoke NLP engine to analyze Chinese social media. It collected the social conversations surrounding a keyword and determined the prevailing emotions.
I developed the core text mining engine to extract sentiment and emotions from text. Working with a language very different from English presented challenges that the popular tools did not cover, which gave me the chance to explore new techniques for the problem.
The task was further complicated by the volume of data to ingest. At nearly 1M weekly, there was a lot of data to acquire and process through the language engine. I achieved this by leveraging crawler queues and analytics workers.
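The queue-and-worker idea can be illustrated with a minimal sketch. This is not the production code: `crawl`, `analyze`, and `run_pipeline` are hypothetical names, the crawler returns made-up strings, and the "analysis" is a placeholder rather than the actual language engine. It only shows how crawled text can be fed through a shared queue to a pool of workers.

```python
import queue
import threading


def crawl(keyword):
    # Hypothetical stand-in for a crawler: returns fake posts for a keyword.
    return [f"{keyword} post {i}" for i in range(3)]


def analyze(text):
    # Hypothetical placeholder for the language engine.
    return {"text": text, "length": len(text)}


def run_pipeline(keywords, num_workers=4):
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        # Each worker pulls text off the queue until it sees the None sentinel.
        while True:
            text = tasks.get()
            if text is None:
                break
            result = analyze(text)
            with lock:
                results.append(result)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # The crawler side: enqueue everything collected for each keyword.
    for keyword in keywords:
        for text in crawl(keyword):
            tasks.put(text)

    # One sentinel per worker signals that no more work is coming.
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(len(run_pipeline(["tea", "coffee"], num_workers=2)))
```

Decoupling acquisition from analysis this way lets each side scale independently: more crawlers when collection is the bottleneck, more workers when analysis is.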
To communicate the results effectively, I used visualization tools ranging from basic trend plots to more complex heat maps. Users are able to drill into the analytics to compare results across social keywords. With responsive design, users can also access it on devices of all sizes.