Blake Rayvid

Sentiment Classifier

Posted on July 12, 2024 (updated May 12, 2026) by Blake

For my Flatiron School Data Science Bootcamp Phase 3 project, I addressed the business problem of brand reputation management by monitoring and analyzing Twitter sentiment. The goal was to develop a machine learning model that correctly classifies tweets as positive, negative, or neutral, and to provide insights that improve brand perception and engagement strategies.

🔗 View on GitHub

View the slides from my presentation here.

Business Problem

Brand reputation management: Monitor brand perception by correctly classifying new tweets as positive, negative or neutral.

  • Product Improvement: Analyze negative feedback for insights into product weaknesses.
  • Collaboration Opportunities: Identify accounts with consistent positive sentiment for potential collaborations.
  • Product Launch Strategy: Time new product launches during periods of high positive sentiment.

Dataset

The training dataset, sourced from Kaggle, includes:

  • Labels: positive, negative or neutral in the ‘sentiment’ column.
  • 27,000 tweets in the ‘text’ column.
  • The ‘selected_text’ column contains the substring of each tweet relevant to classification.
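To make the column layout concrete, here is a minimal pandas sketch of the kind of sanity checks that fit this dataset. The three rows are invented for illustration and are not taken from the Kaggle data; only the column names ('text', 'selected_text', 'sentiment') come from the description above.

```python
import pandas as pd

# Illustrative rows only; these are NOT taken from the Kaggle dataset.
df = pd.DataFrame(
    {
        "text": [
            "I love the new phone, battery lasts all day",
            "The update broke my camera, so annoying",
            "Picked up the laptop from the store today",
        ],
        "selected_text": [
            "love the new phone",
            "so annoying",
            "Picked up the laptop from the store today",
        ],
        "sentiment": ["positive", "negative", "neutral"],
    }
)

# Class balance: how many tweets carry each label.
label_counts = df["sentiment"].value_counts()

# Check that every 'selected_text' really is a substring of its tweet.
is_substring = df.apply(
    lambda row: row["selected_text"] in row["text"], axis=1
)

print(label_counts)
print("all selected_text values are substrings:", bool(is_substring.all()))
```

On the real data, the same two checks would show how balanced the three classes are and whether any rows need cleaning before training.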

Results

I tried several model types, and a Support Vector Classifier (SVC) applied to ‘selected_text’ yielded the best performance. Test set results are summarized below, with precision and recall scores per class and a confusion matrix. Test accuracy was 83%.

Label      Precision   Recall
Negative   83%         77%
Neutral    78%         91%
Positive   93%         80%
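For readers who want to see the general shape of this approach, below is a rough Scikit-Learn sketch of a TF-IDF plus SVC pipeline with per-class precision and recall. It is not the project's actual code: the toy sentences, the linear kernel, and the default vectorizer settings are all assumptions made for illustration, and the numbers it prints have nothing to do with the results in the table.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy training data (invented for illustration, not from the Kaggle set).
train_texts = [
    "love this", "so happy with it", "great service", "really awesome",
    "hate this", "so annoying", "terrible service", "really awful",
    "went to the store", "reading the manual", "it arrived today", "on my way",
]
train_labels = ["positive"] * 4 + ["negative"] * 4 + ["neutral"] * 4

# Toy held-out examples, also invented.
test_texts = ["love the service", "so awful", "arrived at the store"]
test_labels = ["positive", "negative", "neutral"]

# TF-IDF features feeding a support vector classifier.
# kernel="linear" is an assumption; the write-up does not state the kernel.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
model.fit(train_texts, train_labels)

predictions = model.predict(test_texts)
accuracy = accuracy_score(test_labels, predictions)

# Per-class precision and recall, the same metrics reported in the table.
print(classification_report(test_labels, predictions, zero_division=0))
print(f"accuracy: {accuracy:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline means the TF-IDF vocabulary is learned only from the training split, which avoids leaking test-set vocabulary into the features.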

Next Steps

  • Semantic Embedding: Experiment with Word2Vec instead of frequency-based TF-IDF for better context understanding.
  • Dimensionality Reduction: Investigate UMAP or t-SNE to improve model performance.
  • Real-time Classification: Deploy the model as a web service for real-time tweet classification.
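As a starting point for the real-time idea, here is a framework-agnostic sketch of a request handler that a web service could wrap. Everything in it is hypothetical: the function name `handle_classify`, the JSON shape, and the `KeywordModel` stand-in, which only mimics a `predict()` method and would be replaced by the trained classifier.

```python
import json
from typing import Tuple


class KeywordModel:
    """Hypothetical stand-in for the trained classifier; it only mimics predict()."""

    def predict(self, texts):
        labels = []
        for text in texts:
            lowered = text.lower()
            if "love" in lowered:
                labels.append("positive")
            elif "hate" in lowered:
                labels.append("negative")
            else:
                labels.append("neutral")
        return labels


def handle_classify(request_body: str, model) -> Tuple[int, str]:
    """Turn a JSON request body into an (HTTP status, JSON response) pair."""
    try:
        payload = json.loads(request_body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "body must be valid JSON"})

    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return 400, json.dumps({"error": "'text' must be a non-empty string"})

    # The model receives a one-element batch and returns one label.
    label = model.predict([text])[0]
    return 200, json.dumps({"sentiment": str(label)})


status, body = handle_classify('{"text": "I love this brand"}', KeywordModel())
print(status, body)
```

Keeping validation and prediction in a plain function like this makes the logic easy to test before choosing a web framework to expose it.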

This project highlights the importance of sentiment analysis in brand reputation management and provides a foundation for further development and deployment in a real-world setting.

Tags: Matplotlib, NLP, Pandas, Python, Scikit-Learn
Categories: Data Science
