Blake Rayvid

Sentiment Classifier

Posted on July 12, 2024 (updated April 14, 2025) by Blake

For my Flatiron School Data Science Bootcamp Phase 3 project, I addressed the business problem of brand reputation management by monitoring and analyzing Twitter sentiment. The goal was to develop a machine learning model that correctly classifies tweets as positive, negative, or neutral, and to provide insights for improving brand perception and engagement strategies.

View on GitHub

View the slides from my presentation here.

Business Problem

Brand reputation management: Monitor brand perception by correctly classifying new tweets as positive, negative or neutral.

  • Product Improvement: Analyze negative feedback for insights into product weaknesses.
  • Collaboration Opportunities: Identify accounts with consistent positive sentiment for potential collaborations.
  • Product Launch Strategy: Time new product launches during periods of high positive sentiment.

Dataset

The training dataset, sourced from Kaggle, includes:

  • Labels: positive, negative or neutral in the ‘sentiment’ column.
  • 27,000 tweets in the ‘text’ column.
  • The ‘selected_text’ column contains the substring of each tweet relevant to classification.
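
To make the column layout concrete, here is a minimal sketch using Pandas. The three rows are invented for illustration and are not taken from the Kaggle dataset; only the column names ('text', 'selected_text', 'sentiment') come from the description above.

```python
import pandas as pd

# Tiny invented rows that mirror the three columns described above.
# These are NOT real rows from the Kaggle dataset.
df = pd.DataFrame(
    {
        "text": [
            "I love my new phone so much",
            "the battery died again, awful",
            "picking up the order at noon",
        ],
        "selected_text": [
            "love my new phone",
            "awful",
            "picking up the order at noon",
        ],
        "sentiment": ["positive", "negative", "neutral"],
    }
)

# Each 'selected_text' value should be a substring of its full tweet.
is_substring = [
    selected in full for selected, full in zip(df["selected_text"], df["text"])
]

# Class balance: how many tweets carry each label.
label_counts = df["sentiment"].value_counts()
print(label_counts)
```

On the real data, the same two checks (substring consistency and class balance) are a quick sanity pass before any modeling.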

Results

I tried several model types, and a Support Vector Classifier (SVC) applied to ‘selected_text’ yielded the best performance. Test set results are summarized below, with precision and recall scores per class and a confusion matrix. Test accuracy was 83%.

Label      Precision   Recall
Negative   83%         77%
Neutral    78%         91%
Positive   93%         80%
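
For readers who want to see the general shape of this approach, below is a minimal sketch of a TF-IDF plus SVC pipeline in Scikit-Learn. It is not the project's actual code: the toy sentences, the linear kernel, and the default vectorizer settings are all assumptions made for illustration, and the numbers it prints have nothing to do with the results reported above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Invented stand-in data, playing the role of the 'selected_text' column.
train_texts = [
    "love this so much",
    "great service and friendly staff",
    "best purchase ever",
    "this is awful",
    "worst experience, very disappointed",
    "terrible quality, broke in a day",
    "heading to the store at noon",
    "the package arrives on monday",
    "meeting moved to next week",
]
train_labels = [
    "positive", "positive", "positive",
    "negative", "negative", "negative",
    "neutral", "neutral", "neutral",
]

# TF-IDF features feeding a support vector classifier.
# The linear kernel is an assumption; the post does not state the kernel used.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
model.fit(train_texts, train_labels)

# Invented held-out examples for a per-class precision/recall report.
test_texts = [
    "love the friendly staff",
    "awful quality",
    "the store opens on monday",
]
test_labels = ["positive", "negative", "neutral"]

predictions = model.predict(test_texts)
print(classification_report(test_labels, predictions, zero_division=0))
```

Wrapping the vectorizer and classifier in one pipeline keeps the TF-IDF vocabulary fitted on training data only, so the test set cannot leak into the features.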

Next Steps

  • Semantic Embedding: Experiment with Word2Vec instead of frequency-based TF-IDF for better context understanding.
  • Dimensionality Reduction: Investigate UMAP or t-SNE to improve model performance.
  • Real-time Classification: Deploy the model as a web service for real-time tweet classification.

This project highlights the importance of sentiment analysis in brand reputation management and provides a foundation for further development and deployment in a real-world setting.

Tags: Matplotlib, Pandas, Python, Scikit-Learn
