Posts

Building a Basic Retrieval Based Chatbot - Ishika (Part 2)

Overview: This tutorial walks through the code. The libraries used are scikit-learn and NumPy, and the full code is available on GitHub. First we import the following:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    from sklearn.feature_extraction.text import TfidfVectorizer
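To show how these imports might fit together, here is a minimal sketch of a retrieval step. It is not the code from the GitHub repository: the `messages` and `responses` lists and the `reply` helper are hypothetical, made up for illustration. The idea is to vectorize past messages with tf-idf, compare a new query against them with cosine similarity, and return the response paired with the closest message.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical training data: past messages and the responses that followed them.
messages = [
    "hello there",
    "what time do you open",
    "where is your store located",
]
responses = [
    "Hi! How can I help?",
    "We open at 9 am.",
    "We are on Main Street.",
]

# Learn the vocabulary and tf-idf weights from the past messages.
vectorizer = TfidfVectorizer()
message_vectors = vectorizer.fit_transform(messages)


def reply(query):
    """Return the response paired with the stored message most similar to the query."""
    query_vector = vectorizer.transform([query])
    # One similarity score per stored message.
    scores = cosine_similarity(query_vector, message_vectors)[0]
    return responses[scores.argmax()]


print(reply("what time do you open today"))
```

Note that if the query shares no words with any stored message, every score is zero and this sketch simply falls back to the first response; a real bot would probably want a similarity threshold and a default answer.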

Building a Basic Retrieval Based Chatbot - Ishika (Part 1)

Overview: We want to build a basic chatbot that trains on previous messages and their responses. In this tutorial we look at the math used to convert the messages and their associated responses into weights using term frequency and inverse document frequency (tf-idf). Once we have the weights of the words present in the messages and responses, we write each message and response as a vector of those weights. We then measure how similar these vectors are using cosine similarity. We multiply term frequency by inverse document frequency to obtain the final weight of each word, which is used to construct the vector. Cosine Similarity: This is a measure of orientation, not magnitude. We ignore the magnitude of the vectors because it can grow with the length of the query or the associated response, but that does not tell us how similar the query is to the messages that w...
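The weighting and similarity steps described in this excerpt can be sketched in a few lines of NumPy. The excerpt does not show the exact formulas, so this sketch assumes a common variant: term frequency as the word count divided by the message length, and inverse document frequency as log(N / df), where N is the number of messages and df is the number of messages containing the word. The toy `docs` list and all function names are hypothetical.

```python
import numpy as np

# Hypothetical toy corpus of tokenized messages.
docs = [
    "how are you".split(),
    "how old are you".split(),
    "where are you from".split(),
]
vocab = sorted({word for doc in docs for word in doc})


def tf(word, doc):
    # Assumed definition: count of the word divided by the message length.
    return doc.count(word) / len(doc)


def idf(word):
    # Assumed definition: log(N / df). Every vocab word occurs in at least one doc.
    df = sum(1 for doc in docs if word in doc)
    return np.log(len(docs) / df)


def tfidf_vector(doc):
    # Weight of each vocabulary word is tf * idf.
    return np.array([tf(word, doc) * idf(word) for word in vocab])


def cosine(a, b):
    # Cosine of the angle between a and b; defined as 0.0 if either vector is all zeros.
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


v0, v1, v2 = (tfidf_vector(doc) for doc in docs)
print(cosine(v0, v1))      # similarity between messages 0 and 1
print(cosine(v1, 3 * v1))  # scaling a vector does not change its orientation
```

Under this idf definition, words that occur in every message (here "are" and "you") get weight zero, so they do not contribute to the similarity. The last line illustrates the point about orientation: multiplying a vector by a constant changes its magnitude but leaves the cosine similarity unchanged.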