Google Analytics Customer Revenue Prediction

  • Analyzed raw log data from GStore (faced to normal customers) to predict revenue per customer.
  • Preprocessed the raw data (2.4GB) based on Pandas and Numpy and made data visualization based on Matplotlib and Seaborn module.
  • Improved the data procession speed based on parallelization using Hadoop and PySpark.
  • Trained model to predict based on LightGBM, XGBoost and CatBoost, assembled these models to improve 20% performance.
Avatar
Xuanhao(Eric) Zhang
Software Engineer@

Half techs combined with half arts.