These are my lecture notes from Professor Chanyoung Park's graduate course on recommender systems and graph machine learning at KAIST GSDS.
Goal: Recommend items similar to those the user liked
When is it useful?
-> Useful when ratings of other users are not available
How to define similarity?
- Represent each profile as a vector of features and compute the similarity between vectors
- How to pick important features?
-> TF-IDF (Term frequency X Inverse Doc Frequency)
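To make the idea of comparing profile vectors concrete, here is a minimal sketch. It assumes cosine similarity as the measure, which the notes do not specify, and the feature axes, movie names, and values are all hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: an all-zero profile matches nothing
    return dot / (norm_u * norm_v)

# Hypothetical feature axes: [action, comedy, romance]
user_profile = [1.0, 0.0, 1.0]
item_profiles = {
    "movie_a": [1.0, 0.0, 0.0],
    "movie_b": [0.0, 1.0, 0.0],
    "movie_c": [1.0, 0.0, 1.0],
}

# Rank items by how similar their profile is to the user's profile
ranked = sorted(
    item_profiles,
    key=lambda name: cosine_similarity(user_profile, item_profiles[name]),
    reverse=True,
)
print(ranked)
```

Cosine similarity ignores vector length, so an item described by many features is not favored just for having a larger profile vector.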
Term Frequency-Inverse Document Frequency (TF-IDF)
- Bag-of-Words: Each document is represented by a binary vector
We can consider the number of occurrences of a term (i.e., term frequency)
Cons: Order of words/terms is ignored. Recently, word embedding methods such as word2vec are used instead
- Term-document count matrix
tf: the number of times a specific term appears in a specific document
We should also consider the document frequency
idf: the inverse of the document frequency, i.e., of the number of documents in which a specific term appears
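The tf and idf definitions above can be combined as in the following sketch. It assumes raw counts for tf and the common variant idf = log(N / df), which the notes do not specify. The three toy documents and the `top_terms` helper are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: item descriptions keyed by document id
docs = {
    "d1": "the hobbit is a fantasy novel",
    "d2": "the martian is a science fiction novel",
    "d3": "the shining is a horror novel",
}

tokenized = {name: text.split() for name, text in docs.items()}

# tf: number of times each term appears in each document
tf = {name: Counter(tokens) for name, tokens in tokenized.items()}

# df: number of documents in which each term appears
df = Counter(term for tokens in tokenized.values() for term in set(tokens))

# idf: log(N / df), one common variant (an assumption, not from the notes)
n_docs = len(docs)
idf = {term: math.log(n_docs / count) for term, count in df.items()}

# TF-IDF weight of every term in every document
tfidf = {
    name: {term: count * idf[term] for term, count in counts.items()}
    for name, counts in tf.items()
}

def top_terms(name, k=2):
    """Return the k highest-weighted terms of a document (ties broken alphabetically)."""
    weights = tfidf[name]
    return sorted(weights, key=lambda t: (-weights[t], t))[:k]

print(top_terms("d1"))
```

Terms that occur in every document, such as "the" or "novel" here, get an idf of zero and drop out of the profile, while terms unique to one document carry the most weight.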
Content-based approach: Pros/Cons
- Pros
No cold-start or sparsity problem
Able to recommend to users with unique tastes
Able to recommend new and unpopular items
Able to provide explanations
- Cons
Requires content that can be encoded as meaningful features
Difficult to implement serendipity
Easy to overfit
Effective for providing recommendations for new items, but not for new users
Pure content-based systems are rarely found in commercial environments