Identifying Duplicate Quora Question Pairs (Kaggle Competition Bronze Medal Winner)

  • We explored current NLP methods, including word2vec embeddings (via the gensim package in Python), LSTMs (via the Keras neural networks API), tf-idf, and the Python nltk package.
  • We built machine learning models that identify duplicate Quora question pairs, reaching a log loss of about 0.151.
  • We ranked in the top 8% in this Kaggle …
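
For readers unfamiliar with the metric quoted above, here is a minimal sketch of binary log loss in plain Python. It is not the code used in the competition; the function name, the clipping constant `eps`, and the sample labels and probabilities are all assumptions made for illustration.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("y_true and y_prob must be non-empty and the same length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip the probability so log(0) is never evaluated.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical data: 1 means "duplicate pair", 0 means "not duplicate".
labels = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.8, 0.3]
score = log_loss(labels, probs)
print(f"log loss: {score:.4f}")
```

Lower is better: confident correct predictions contribute almost nothing to the average, while confident wrong ones are penalized heavily.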

How I Build My First Pelican Blog

After completing several data science projects, I am eager to document them and share them with others. It took me several days to research, set up, and write my blog, but I feel that building a Pelican blog can be much easier and faster, so I am sharing with …
