NLP

Natural Language Processing (NLP) is a subfield of Artificial Intelligence that focuses on the interaction between computers and humans using natural language. The goal of NLP is to enable computers to understand, interpret, and generate human language in a way that is meaningful and useful.
NLP involves a variety of tasks, such as text classification, sentiment analysis, named entity recognition, machine translation, question answering, and speech recognition. These tasks are often achieved using machine learning and deep learning techniques, such as neural networks.

NLP has many practical applications, including chatbots, virtual assistants, sentiment analysis for social media monitoring, automated translation services, and text-to-speech systems. NLP is also used in the field of healthcare for clinical decision support and electronic health record management.
As the amount of text data continues to grow, the importance of NLP in business and research is becoming increasingly evident. The development of new algorithms and techniques is advancing the field, and there is a growing need for professionals with NLP expertise in a variety of industries.

Topics

The guessing game is an AI game in which the user is allocated 5 points at the start and presented with a set of blanks indicating the length of the word to be guessed. The user guesses letters that could be in the word: each correct guess (a letter present in the word) adds a point to the user's score, and each incorrect guess deducts a point. If the score reaches zero, the user loses; if the user guesses the whole word, they win.
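
A minimal sketch of that game loop, assuming the secret word has already been chosen (the word list here is hypothetical; in the project it would come from a preprocessed corpus):

```python
import random

def guessing_game(word):
    score = 5                                   # the user starts with 5 points
    guessed = set()                             # letters guessed so far
    while score > 0:
        # show blanks for letters not yet guessed
        display = " ".join(c if c in guessed else "_" for c in word)
        print(display, f"(score: {score})")
        if set(word) <= guessed:                # whole word uncovered: win
            print("You win!")
            return
        letter = input("Guess a letter: ").strip().lower()
        if letter in word and letter not in guessed:
            score += 1                          # correct guess adds a point
        elif letter not in word:
            score -= 1                          # incorrect guess deducts a point
        guessed.add(letter)
    print("Your score reached zero - you lose. The word was:", word)

guessing_game(random.choice(["language", "corpus", "token"]))
```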

Topics covered

  • Natural Language Processing (NLP) fundamentals
  • Data pre-processing and preparation
  • Training and testing machine learning models
  • Performance evaluation and optimization
  • Removing non-alpha tokens
  • Removing stopwords
  • Lemmatizing tokens
  • POS tagging

| View on GitHub |

A web crawler, also known as a spider, is a software program that systematically browses the World Wide Web to index and retrieve information. The process of web crawling involves automatically following links from one web page to another, and extracting information from each page visited.

Web crawlers are commonly used by search engines to collect data on web pages and create an index that can be used to respond to search queries. The crawler starts with a list of URLs to visit, and then follows the links on each page to discover new URLs to visit. As it visits each page, the crawler extracts information such as the page title, URL, and content. Web crawlers can also be used for other purposes, such as data mining, web scraping, and monitoring changes to a website.
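
As an illustration of that crawl loop, here is a minimal sketch using the requests and BeautifulSoup libraries (the libraries, the start URL, and the page limit are assumptions; the project's actual implementation may differ):

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def crawl(start_url, max_pages=10):
    frontier, visited = [start_url], set()
    while frontier and len(visited) < max_pages:
        url = frontier.pop(0)
        if url in visited:
            continue
        visited.add(url)
        try:
            page = requests.get(url, timeout=5)
        except requests.RequestException:
            continue
        soup = BeautifulSoup(page.text, "html.parser")
        # extract basic information from the visited page
        title = soup.title.string if soup.title else "(no title)"
        print(url, "-", title)
        # follow links on the page to discover new URLs
        for a in soup.find_all("a", href=True):
            frontier.append(urljoin(url, a["href"]))

crawl("https://example.com")
```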

Topics covered

  • Web scraping and data extraction
  • Natural Language Processing (NLP) fundamentals
  • Data pre-processing and preparation
  • Text analysis and visualization
  • Filtering and crawling relevant links
  • Scraping and cleaning website data
  • Calculate TF-IDF
  • Create Knowledge Base for the chatbot
  • Storing Knowledge Base to a database

| View on GitHub | Report |

WordNet and SentiWordNet summary

WordNet is a lexical database that organizes English words into groups of synonyms called synsets. Each synset represents a distinct concept, and each word in the synset is considered a possible synonym of the others. WordNet also includes relationships between synsets, such as hyponymy (where one synset is a subtype of another), meronymy (where one synset is a part of another), and antonymy (where one synset is the opposite of another). WordNet is widely used in natural language processing (NLP) and computational linguistics for tasks such as text classification, word sense disambiguation, and information retrieval.

SentiWordNet is an extension of WordNet that assigns sentiment scores to synsets. Each synset is given three scores between 0 and 1, representing the degree of positivity, negativity, and neutrality associated with the concept it represents. These scores are generated by combining the scores of its component words, which are manually annotated with positive and negative sentiment values. SentiWordNet is commonly used in sentiment analysis applications, where it can be used to determine the overall sentiment of a piece of text based on the sentiment scores of its component words.
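
Both resources are available through NLTK's corpus readers; a short sketch of looking up synsets and sentiment scores (assuming the wordnet and sentiwordnet corpora have been downloaded):

```python
import nltk
nltk.download("wordnet")
nltk.download("sentiwordnet")
from nltk.corpus import wordnet as wn, sentiwordnet as swn

# WordNet: synsets, definitions, and relations for a word
for syn in wn.synsets("good")[:3]:
    print(syn.name(), "-", syn.definition())
    print("  hypernyms:", [h.name() for h in syn.hypernyms()])

# SentiWordNet: positivity/negativity/objectivity scores for a synset
senti = swn.senti_synset("good.a.01")
print(senti.pos_score(), senti.neg_score(), senti.obj_score())
```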

Topics covered

  • Lexical semantics and ontologies
  • Word sense disambiguation
  • Semantic similarity and relatedness
  • NLP applications such as text classification and information retrieval
  • Synset

| View on GitHub | Report |

In natural language processing (NLP), n-grams are contiguous sequences of n items (usually words) in a text. For example, a 2-gram (also called a bigram) might be "natural language," and a 3-gram (also called a trigram) might be "machine learning algorithms."

N-grams are commonly used in NLP for a variety of tasks, such as language modeling, text classification, and information retrieval. One of the main benefits of using n-grams is that they capture the local context of words in a text. For example, a 2-gram like "ice cream" provides more specific information than the individual words "ice" and "cream" would on their own.
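
A quick sketch of extracting n-grams and estimating a simple bigram probability with NLTK (the toy sentence is an assumption; plain Python slicing would work just as well):

```python
from collections import Counter
from nltk import ngrams, word_tokenize  # word_tokenize needs the 'punkt' models

text = "natural language processing captures the local context of words"
tokens = word_tokenize(text)

bigrams = list(ngrams(tokens, 2))
trigrams = list(ngrams(tokens, 3))
print(bigrams[:2])  # [('natural', 'language'), ('language', 'processing')]

# a simple bigram language-model estimate: P(w2 | w1) = count(w1 w2) / count(w1)
unigram_counts = Counter(tokens)
bigram_counts = Counter(bigrams)
print(bigram_counts[("natural", "language")] / unigram_counts["natural"])
```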

Topics covered

  • Text processing and representation
  • Language modeling and prediction
  • Statistical methods in NLP
  • Applications such as machine translation and speech recognition
  • Unigrams
  • Bigrams

| View on GitHub | Report |

Sentence parsing, also known as syntactic parsing or simply parsing, is a natural language processing (NLP) task that involves analyzing the grammatical structure of a sentence to determine its underlying syntactic relationships. The goal of sentence parsing is to produce a structured representation of a sentence that captures its grammatical structure.

There are two main approaches to sentence parsing in NLP: rule-based parsing and statistical parsing. Rule-based parsing involves using a set of predefined rules or grammar to analyze the structure of a sentence. These rules can be based on formal grammars, such as context-free grammars or dependency grammars. Statistical parsing, on the other hand, involves using machine learning algorithms to automatically learn the rules or patterns that govern the syntactic structure of language.
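
A brief sketch of both approaches: a toy context-free grammar parsed with NLTK's chart parser for the rule-based side, and spaCy's pretrained dependency parser for the statistical side (the grammar and the en_core_web_sm model are assumptions):

```python
import nltk

# Rule-based: define a small CFG and parse with a chart parser
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the' | 'a'
    N -> 'dog' | 'ball'
    V -> 'chased'
""")
for tree in nltk.ChartParser(grammar).parse("the dog chased a ball".split()):
    tree.pretty_print()

# Statistical: spaCy's pretrained dependency parser
import spacy
nlp = spacy.load("en_core_web_sm")
for token in nlp("The dog chased a ball."):
    print(token.text, token.dep_, token.head.text)
```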

Topics covered

  • Syntax and grammatical rules in NLP
  • Parsing algorithms and techniques
  • Dependency and constituency parsing
  • Applications such as named entity recognition and sentiment analysis
  • Constituent parsing
  • Part-of-speech tagging
  • Semantic role labeling
  • PSG Tree
  • Dependency parsing
  • SRL Parser

| View on GitHub | Report |

A chatbot, also known as a conversational agent, is a software program designed to simulate human-like conversations with users, either through text or voice interactions. Chatbots use natural language processing (NLP) and artificial intelligence (AI) technologies to understand and respond to user queries and requests. Chatbots can be designed for a variety of applications, such as customer service, e-commerce, entertainment, and personal assistance. They can be deployed on websites, mobile apps, messaging platforms, and voice assistants, such as Amazon Alexa and Google Assistant.

Chatbots can be categorized into two types: rule-based and AI-based. Rule-based chatbots follow predefined rules and scripts to respond to user inputs. They are typically limited in their capabilities and require manual intervention to update or modify their responses.

AI-based chatbots, on the other hand, use machine learning and natural language processing algorithms to understand and generate responses to user queries. They can learn from user interactions and improve their responses over time, without the need for manual intervention.
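
As a contrast to the AI-based approach, a rule-based chatbot can be sketched in a few lines of pattern matching (the patterns and replies here are hypothetical):

```python
import re

# each rule pairs a regular-expression pattern with a canned reply
rules = [
    (r"\b(hi|hello|hey)\b", "Hello! How can I help you?"),
    (r"\bhours?\b", "We are open 9am-5pm, Monday to Friday."),
    (r"\b(bye|goodbye)\b", "Goodbye!"),
]

def respond(user_input):
    for pattern, reply in rules:
        if re.search(pattern, user_input.lower()):
            return reply
    return "Sorry, I don't understand. Could you rephrase?"

print(respond("Hi there"))
print(respond("What are your hours?"))
```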

Topics covered

  • Natural Language Understanding (NLU)
  • Intent classification
  • Named Entity Recognition
  • Dialogue systems and conversational agents
  • Language generation and response planning
  • Evaluation and optimization of chatbots

| View on GitHub | Report |

Adaptive Testing and Debugging of NLP Models is an academic paper on the process of iteratively testing and improving an NLP model based on feedback from users or other sources.

In adaptive testing, the NLP model is modified based on the results of previous tests. For example, if the model consistently fails to correctly identify a certain type of entity, such as a person's name, the model can be adjusted to improve its performance in that area. Adaptive testing can be used to improve the accuracy and efficiency of NLP models over time, as the model is updated to better handle real-world data and use cases.

Debugging NLP models involves identifying and fixing errors or bugs in the model's code or design. This can involve manually reviewing the model's output and identifying areas where it is not performing as expected, or using automated debugging tools to identify and fix issues more quickly.

Both adaptive testing and debugging are important processes in the development and optimization of NLP models, as they help to ensure that the model is accurate, efficient, and effective at handling real-world data and use cases.
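
A sketch of the error-analysis loop that underlies both processes: collect the model's failures, tally the most common confusions, and use them to decide what to test or retrain next (the model object and its scikit-learn-style predict method are hypothetical):

```python
from collections import Counter

def error_analysis(model, texts, gold_labels):
    # collect every example the model gets wrong
    failures = [(t, g, p) for t, g, p in
                zip(texts, gold_labels, model.predict(texts)) if g != p]
    # tally which (gold, predicted) confusions are most common
    confusions = Counter((g, p) for _, g, p in failures)
    for (gold, pred), n in confusions.most_common(5):
        print(f"{gold} -> {pred}: {n} errors")
    return failures  # feed these back into targeted tests or training data
```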

Topics covered

  • Model evaluation and quality assessment
  • Error analysis and diagnosis
  • Data-driven approaches to model debugging
  • Practical tips and tools for debugging NLP models

| Report |

Text classification is the process of assigning predefined categories to text data. In this project, we use the scikit-learn library to perform text classification on a dataset of fake news. The program can be used for various applications such as sentiment analysis, spam detection, and topic modeling.
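
A minimal sketch of such a scikit-learn pipeline, pairing TF-IDF features with Naive Bayes (the CSV file and its column names are assumptions):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report

df = pd.read_csv("fake_news.csv")  # hypothetical dataset file
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42)

# convert raw text to TF-IDF features
vectorizer = TfidfVectorizer(stop_words="english")
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

clf = MultinomialNB().fit(X_train_tfidf, y_train)
print(classification_report(y_test, clf.predict(X_test_tfidf)))
```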

Topics covered

  • WordCloud
  • CountVectorizer
  • TfidfVectorizer
  • Naive Bayes
  • LogisticRegression
  • MLPClassifier
  • Feature extraction
  • Model selection
  • Evaluation metrics: accuracy, precision, recall, F1-score

| View on GitHub | Report |

Keras is a high-level neural network API that allows for rapid prototyping of deep learning models. In the context of NLP, Keras can be used for tasks such as sentiment analysis, named entity recognition, and machine translation.
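
A minimal Keras sketch for one such task: an Embedding layer feeding an LSTM for binary text classification (the vocabulary size and the padded training arrays are assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 10000  # assumed vocabulary size

model = keras.Sequential([
    layers.Embedding(vocab_size, 64),      # learned word embeddings
    layers.LSTM(64),                       # recurrent layer over the sequence
    layers.Dense(1, activation="sigmoid")  # binary prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(X_train, y_train, epochs=5, validation_split=0.1)  # hypothetical arrays
```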

Topics covered

  • Deep learning for NLP
  • Convolutional Neural Networks (CNNs) for text classification
  • Recurrent Neural Networks (RNNs) for text classification
  • LSTM for text classification
  • Embedding
  • Predefined Embedding
  • Transfer learning
  • Model interpretation and visualization

| View on GitHub | Report |

Technical Skills

Soft Skills

About Me

I am passionate about NLP and constantly seek to learn more about this rapidly changing field. I have plans to work on personal projects that involve updating the chatbot and building a recommendation system for online news articles. To keep up with the latest developments, I read research papers, participate in online communities, and attend conferences and workshops.

I am also interested in possible employment opportunities that allow me to apply my skills and contribute to the NLP community.

Special Thanks

Grateful to Dr. Karen Mazidi for her exceptional dedication to teaching and for creating a supportive learning environment.