final exam 6100
The back-propagation (BP) algorithm is often used for training feed-forward neural networks. Why do you need to calculate the gradient in the BP algorithm?
Backpropagation is an algorithm used to train a neural network. The output error is propagated backward through the layers so that each weight can be updated toward a lower-error solution. The gradient is needed because it gives the rate of change of the error with respect to each weight: it tells us in which direction, and by how much, each weight should be adjusted. An optimization algorithm such as gradient descent then uses this gradient to update the weights.
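The role of the gradient can be sketched on the smallest possible case. The example below is a minimal sketch, not a full BP implementation: it assumes a single sigmoid neuron, a squared-error function, and made-up values for the input, target, and learning rate. The names `train_step`, `lr`, and `errors` are illustrative only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, target, lr=0.5):
    """One gradient-descent update for a single sigmoid neuron (assumed setup)."""
    y = sigmoid(w * x + b)                 # forward pass
    error = 0.5 * (y - target) ** 2        # squared error
    # Chain rule: dE/dw = (y - target) * y * (1 - y) * x
    delta = (y - target) * y * (1.0 - y)
    grad_w = delta * x
    grad_b = delta
    # Move each parameter against its gradient
    return w - lr * grad_w, b - lr * grad_b, error

w, b = 0.0, 0.0
errors = []
for _ in range(200):
    w, b, e = train_step(w, b, x=1.0, target=0.9)
    errors.append(e)

print(errors[0], errors[-1])  # the error shrinks as the weights follow the gradient
```

Without the gradient there would be no way to know whether `w` and `b` should go up or down; the sign and size of `grad_w` and `grad_b` supply exactly that information.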
Item-Item CF (Cont'd)
Core: finding items similar to the ones the user has already liked. Finding similarity between items:
Collaborative/Social-filtering system
Idea: users who have expressed similar interests (e.g., ratings) in the past will share common interests in the future. Aggregates consumers' preferences and makes recommendations to other users based on similarity in behavioral patterns.
In a feed-forward neural network trained by BP: during feed-forward, from which layer to which layer is the input signal broadcast? What is back-propagation?
In a feed-forward neural network, the inputs are propagated from the input layer, through any hidden layers, towards the output layer, and the errors are backpropagated from the output layer towards the input layer. Backpropagation works by calculating the error with a cost (error) function and propagating that error from the output layer back towards the input layer.
Item-Item CF
It looks at the items the user has consumed, finds other items similar to them, and recommends accordingly.
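One way to picture item-item CF is the sketch below. It assumes cosine similarity between item rating vectors and a simple similarity-weighted score; the card does not name a similarity measure, so both are assumptions, and the `ratings` data is invented for illustration.

```python
import math

# Hypothetical ratings: user -> {item: rating}
ratings = {
    "u1": {"A": 5, "B": 4, "D": 5},
    "u2": {"A": 5, "B": 5},
    "u3": {"A": 1, "C": 5},
    "u4": {"C": 4, "D": 1},
}

def item_vector(item):
    """Ratings an item received, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(v1, v2):
    """Cosine similarity between two sparse rating vectors."""
    common = set(v1) & set(v2)
    if not common:
        return 0.0
    dot = sum(v1[u] * v2[u] for u in common)
    norm1 = math.sqrt(sum(x * x for x in v1.values()))
    norm2 = math.sqrt(sum(x * x for x in v2.values()))
    return dot / (norm1 * norm2)

def recommend(user, k=1):
    """Score unseen items by their similarity to items the user already rated."""
    seen = ratings[user]
    all_items = {i for r in ratings.values() for i in r}
    scores = {}
    for cand in sorted(all_items - set(seen)):
        scores[cand] = sum(
            cosine(item_vector(cand), item_vector(i)) * r for i, r in seen.items()
        )
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("u2"))
```

Here user `u2` liked items A and B; item D was rated highly by another user who also liked A and B, so D scores above C.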
Stop words
Many of the most frequently used words in English are worthless in information retrieval and text mining
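A minimal sketch of stop-word removal, assuming whitespace tokenization and a tiny hand-written stop list (`STOP_WORDS` is hypothetical; real lists are much longer):

```python
# Hypothetical, deliberately tiny stop list for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "and", "to"}

def remove_stop_words(text):
    """Lowercase, split on whitespace, and drop tokens found in STOP_WORDS."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]

print(remove_stop_words("The history of text mining is a story"))
```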
TWO APPROACHES TO CF
Memory-based - Model-based
memory-based CF step 2
Neighborhood Selection
the following are correct about neural networks
Neural networks learn during training by adjusting the weights - If training a network takes a long time, decreasing the learning rate alpha may help - Weights should be initialized at the beginning, but not all to zero - In the batch method, the average value of the weights is used to modify the weights.
memory-based CF step 3
Prediction Generation
Content-based system
Recommendations based on item descriptions/features and the past behavior of the target user only. Supervised machine learning is used to induce a classifier that discriminates between interesting and uninteresting items for the user.
What are Recommender Systems (RS)?
Recommender systems are a technological proxy for a social process: a different information discovery model in which people try to find other people with similar tastes and then ask them to suggest new things.
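In a feed-forward network using sigmoidal (logistic) activation functions, why might one avoid setting target output values to 0.0 and 1.0 and prefer to use 0.1 and 0.9 (or even 0.2 and 0.8)?
A sigmoid only approaches 0 and 1 asymptotically, so it can never output exactly 0.0 or 1.0. Targets of 0.0 and 1.0 push the weights towards ever larger magnitudes and drive the units into saturation, where the derivative of the sigmoid is close to zero and learning slows down. Targets such as 0.1 and 0.9 (or 0.2 and 0.8) are reachable with finite weights and keep the units in the region where the gradient is usable.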
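The effect of the target value can be checked numerically. The sketch below is illustrative only: for a few target outputs it computes the net input a logistic unit would need to produce that output, and the slope of the sigmoid at that point. The helper names `required_input` and `sigmoid_slope` are made up for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def required_input(target):
    """Net input z for which sigmoid(z) == target (inverse of the sigmoid)."""
    return math.log(target / (1.0 - target))

def sigmoid_slope(y):
    """Derivative of the sigmoid expressed through its output: y * (1 - y)."""
    return y * (1.0 - y)

for target in (0.8, 0.9, 0.99, 0.999999):
    print(target, required_input(target), sigmoid_slope(target))
```

As the target moves towards 1.0, the required net input grows without bound while the slope shrinks towards zero, which is why exact 0.0 and 1.0 targets are usually avoided.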
When BP is used, the error function must be differentiable. Why?
The error function must be differentiable because BP uses the chain rule to compute the partial derivative of the error with respect to every weight. If the error function is not differentiable, this gradient is undefined, so there is no direction in which to update the weights and the network cannot be trained by gradient descent.
Which one is not the correct method to prevent overfitting for neural networks
Use a large training data
memory-based CF step 1
User Similarity Measurement
Word Embeddings
Vector representations of a particular word
Knowledge-based system
Knowledge about users and products is used to reason about what meets the user's requirements, using discrimination trees, decision support tools, and case-based reasoning (CBR).
Corpus
a collection of text documents
term frequency-inverse document frequency
a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.
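A small sketch of how the statistic can be computed. TF-IDF has several variants; this one assumes term frequency as count divided by document length and inverse document frequency as the natural log of N / df. The toy corpus `docs` is invented for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cats and the dogs".split(),
]

def tf_idf(term, doc, corpus):
    """tf = count / len(doc); idf = ln(N / df). One common variant, assumed here."""
    tf = Counter(doc)[term] / len(doc)
    df = sum(1 for d in corpus if term in d)   # documents containing the term
    if df == 0:
        return 0.0
    idf = math.log(len(corpus) / df)
    return tf * idf

print(tf_idf("cat", docs[0], docs))  # rare term: positive weight
print(tf_idf("the", docs[0], docs))  # appears in every document: weight 0.0
```

A word that occurs in every document carries no discriminating power, so its weight drops to zero even though it is frequent.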
N-Gram
a sequence of n consecutive words: unigrams, bi-grams (e.g., "machine learning" is a bigram), and tri-grams
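A minimal sketch of n-gram extraction, assuming the text has already been split into word tokens (the function name `ngrams` is illustrative):

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "machine learning is fun".split()
print(ngrams(tokens, 1))  # unigrams
print(ngrams(tokens, 2))  # bigrams
print(ngrams(tokens, 3))  # trigrams
```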
Which of the following about neural networks is incorrect?
a) Neural networks learn during training by adjusting the weights. b) Neural networks with linear activation functions can be modelled as single-layer networks. c) If training a network takes a long time, decreasing the learning rate alpha may help. d) Weights should be initialized at the beginning, but not all to zero. e) In the batch method, the average value of the weights is used to modify the weights.
Collaborative Filtering (CF)-based RS
Also known as Social Filtering. Automates the word-of-mouth recommendation process.
Model-based
methods use machine learning or statistical modeling techniques to learn or build a model using a user-item matrix. The model is then used for rating prediction.
Memory-based
methods use the ratings to compute similarities between users or items, which are then exploited to produce recommendations
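The three memory-based steps (user similarity measurement, neighborhood selection, prediction generation) can be sketched as follows. This is one possible instance, not the only one: it assumes Pearson correlation over co-rated items, top-k neighbors with positive similarity, and a mean-centered weighted average for the prediction. All names and ratings are hypothetical.

```python
import math

# Hypothetical ratings: user -> {item: rating}
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 3, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
    "dave":  {"m1": 5, "m2": 3, "m3": 5, "m4": 5},
}

def mean(user):
    return sum(ratings[user].values()) / len(ratings[user])

def pearson(a, b):
    """Step 1: similarity of two users over the items both have rated."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if len(common) < 2:
        return 0.0
    ma = sum(ratings[a][i] for i in common) / len(common)
    mb = sum(ratings[b][i] for i in common) / len(common)
    num = sum((ratings[a][i] - ma) * (ratings[b][i] - mb) for i in common)
    den = math.sqrt(sum((ratings[a][i] - ma) ** 2 for i in common)) * \
          math.sqrt(sum((ratings[b][i] - mb) ** 2 for i in common))
    return num / den if den else 0.0

def neighbors(active, item, k=2):
    """Step 2: the k most similar users who rated the item (positive similarity only)."""
    cands = [(pearson(active, u), u) for u in ratings
             if u != active and item in ratings[u]]
    return sorted((c for c in cands if c[0] > 0), reverse=True)[:k]

def predict(active, item, k=2):
    """Step 3: active user's mean plus the weighted deviation of the neighbors."""
    nbrs = neighbors(active, item, k)
    if not nbrs:
        return mean(active)
    num = sum(s * (ratings[u][item] - mean(u)) for s, u in nbrs)
    den = sum(abs(s) for s, _ in nbrs)
    return mean(active) + num / den

print(neighbors("alice", "m4"))
print(predict("alice", "m4"))
```

In this toy data, carol's tastes run opposite to alice's, so she is excluded from the neighborhood, and the prediction for `m4` is driven by bob and dave.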
examples of textual documents in digital format
e-mails, corporate Web pages, blogs, doctors' notes, technical papers, incident reports, news stories, and more...
Text Mining and Analytics
Text mining is a process of identifying novel information from a collection of text
bag-of-words model
is a simplified representation of text in natural language processing: a text (e.g., a sentence or a document) is represented as the bag of its words, disregarding grammar and even word order.
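A minimal sketch of the bag-of-words idea, assuming lowercasing and whitespace tokenization (a real pipeline would usually do more careful tokenization):

```python
from collections import Counter

def bag_of_words(text):
    """Map each word to its count; word order is discarded."""
    return Counter(text.lower().split())

a = bag_of_words("the dog bit the man")
b = bag_of_words("the man bit the dog")
print(a)
print(a == b)  # same bag, although the sentences mean different things
```

The two sentences produce identical bags, which shows both the simplicity of the model and what it loses by ignoring word order.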
Natural language processing (NLP)
is a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages
Neighbor Selection
For a given active user a, select correlated users to serve as the source of predictions.
Stemming
reducing inflected words to their stem (root form)
motivation for RS
reducing information overload and gaining competitive advantages
Terms
single words or multi-word phrases
items recommended for RS
Items such as a product, a movie, or a novel, recommended to users based on previously collected user preferences.