Pablo Rodriguez

Collaborative Filtering Programming

Programming Assignment: Collaborative Filtering Recommender Systems

Objective: Implement collaborative filtering to build a movie recommender system using the real MovieLens dataset.

  • Source: MovieLens “ml-latest-small” dataset
  • Original size: 9000 movies rated by 600 users
  • Reduced dataset: nu = 443 users, nm = 4778 movies
  • Rating scale: 0.5 to 5 in steps of 0.5
  • Focus: movies released since 2000

Symbol   Description                               Python Variable
r(i,j)   1 if user j rated movie i, 0 otherwise    R
y(i,j)   Rating given by user j on movie i         Y
w^(j)    Parameter vector for user j               W
b^(j)    Bias parameter for user j                 b
x^(i)    Feature vector for movie i                X
nu       Number of users                           num_users
nm       Number of movies                          num_movies
n        Number of features                        num_features

The collaborative filtering goal is to learn:

  • User parameters: w^(j) and bias b^(j) for each user j
  • Movie features: x^(i) for each movie i
  • Prediction: the predicted rating of movie i by user j is w^(j) · x^(i) + b^(j)
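
As a small illustration of the prediction rule, the sketch below uses made-up values (two features, two movies, one user). The numbers and the names `w_j`, `b_j`, `x_i`, and `P` are chosen for this example only and do not come from the assignment.

```python
import numpy as np

# Hypothetical learned values for one user j and one movie i (n = 2 features).
w_j = np.array([1.0, 0.5])   # user parameter vector w^(j)
b_j = 0.25                   # user bias b^(j)
x_i = np.array([2.0, 3.0])   # movie feature vector x^(i)

# Predicted rating: w^(j) . x^(i) + b^(j) = 1.0*2.0 + 0.5*3.0 + 0.25 = 3.75
pred = np.dot(w_j, x_i) + b_j
print(pred)

# The same rule for every (movie, user) pair at once, using the matrix shapes
# from the notation table: X is (num_movies, n), W is (num_users, n), b is (1, num_users).
X = np.array([[2.0, 3.0],
              [0.0, 1.0]])   # 2 movies
W = np.array([[1.0, 0.5]])   # 1 user
b = np.array([[0.25]])
P = X @ W.T + b              # shape (num_movies, num_users)
print(P)
```

Row i, column j of `P` holds the predicted rating of movie i by user j, which is the same layout as Y and R.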

REQUIRED CODE: cofi_cost_func Implementation

You need to implement the collaborative filtering cost function:

import numpy as np

def cofi_cost_func(X, W, b, Y, R, lambda_):
    """
    Returns the cost for collaborative filtering
    Args:
      X (ndarray (num_movies,num_features)): matrix of item features
      W (ndarray (num_users,num_features)) : matrix of user parameters
      b (ndarray (1, num_users))           : vector of user bias parameters
      Y (ndarray (num_movies,num_users))   : matrix of user ratings of movies
      R (ndarray (num_movies,num_users))   : matrix, where R(i, j) = 1 if user j rated movie i
      lambda_ (float): regularization parameter
    Returns:
      J (float) : Cost
    """
    nm, nu = Y.shape
    J = 0
    ### START CODE HERE ###
    for j in range(nu):
        w = W[j,:]
        b_j = b[0,j]
        for i in range(nm):
            x = X[i,:]
            y = Y[i,j]
            r = R[i,j]
            # r is 0 for unrated pairs, so they contribute nothing to the sum
            J += np.square(r * (np.dot(w,x) + b_j - y))
    J = J/2
    # Add regularization
    J += (lambda_/2) * (np.sum(np.square(W)) + np.sum(np.square(X)))
    ### END CODE HERE ###
    return J
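
The cost being computed is:

$$
J = \frac{1}{2} \sum_{(i,j):\, r(i,j)=1} \left( w^{(j)} \cdot x^{(i)} + b^{(j)} - y^{(i,j)} \right)^2
  + \frac{\lambda}{2} \sum_{j=0}^{n_u-1} \sum_{k=0}^{n-1} \left( w^{(j)}_k \right)^2
  + \frac{\lambda}{2} \sum_{i=0}^{n_m-1} \sum_{k=0}^{n-1} \left( x^{(i)}_k \right)^2
$$

Expected cost values:

  • Without regularization (λ=0): Cost = 13.67
  • With regularization (λ=1.5): Cost = 28.09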

A vectorized version is provided for efficiency:

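import tensorflow as tf

def cofi_cost_func_v(X, W, b, Y, R, lambda_):
    # Error for every (movie, user) pair; multiplying by R zeroes the unrated ones
    j = (tf.linalg.matmul(X, tf.transpose(W)) + b - Y)*R
    J = 0.5 * tf.reduce_sum(j**2) + (lambda_/2) * (tf.reduce_sum(X**2) + tf.reduce_sum(W**2))
    return J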
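
To see that the loop and vectorized forms compute the same quantity, the following sketch restates both in plain NumPy and compares them on small random data. The function names `cost_loop` and `cost_vec`, the array sizes, and the random inputs are assumptions made for this example; the assignment's own vectorized version uses TensorFlow.

```python
import numpy as np

def cost_loop(X, W, b, Y, R, lambda_):
    """Loop form of the cost, mirroring the structure of cofi_cost_func."""
    nm, nu = Y.shape
    J = 0.0
    for j in range(nu):
        for i in range(nm):
            if R[i, j]:  # only rated (movie, user) pairs contribute
                J += (np.dot(W[j], X[i]) + b[0, j] - Y[i, j]) ** 2
    return J / 2 + (lambda_ / 2) * (np.sum(W ** 2) + np.sum(X ** 2))

def cost_vec(X, W, b, Y, R, lambda_):
    """Vectorized form: mask the full error matrix with R, then sum."""
    err = (X @ W.T + b - Y) * R
    return 0.5 * np.sum(err ** 2) + (lambda_ / 2) * (np.sum(X ** 2) + np.sum(W ** 2))

# Small random problem (sizes chosen only for illustration).
rng = np.random.default_rng(0)
nm, nu, n = 5, 4, 3
X = rng.normal(size=(nm, n))
W = rng.normal(size=(nu, n))
b = rng.normal(size=(1, nu))
R = (rng.random((nm, nu)) < 0.6).astype(float)    # 1 where a rating exists
Y = rng.integers(1, 11, size=(nm, nu)) * 0.5 * R  # ratings 0.5..5.0, 0 if unrated

same = np.isclose(cost_loop(X, W, b, Y, R, 1.5), cost_vec(X, W, b, Y, R, 1.5))
print(same)
```

The bias `b` has shape (1, num_users), so broadcasting adds each user's bias down the corresponding column of the prediction matrix.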

The model is then trained with a custom TensorFlow loop:

import tensorflow as tf
from tensorflow import keras

# Ynorm, R, num_users and num_movies are assumed to be defined earlier in the notebook

# Set parameters
num_features = 100
tf.random.set_seed(1234)

# Initialize variables
W = tf.Variable(tf.random.normal((num_users, num_features), dtype=tf.float64), name='W')
X = tf.Variable(tf.random.normal((num_movies, num_features), dtype=tf.float64), name='X')
b = tf.Variable(tf.random.normal((1, num_users), dtype=tf.float64), name='b')

# Setup optimizer
optimizer = keras.optimizers.Adam(learning_rate=1e-1)

iterations = 200
lambda_ = 1
for it in range(iterations):
    # Record the operations used to compute the cost
    with tf.GradientTape() as tape:
        cost_value = cofi_cost_func_v(X, W, b, Ynorm, R, lambda_)

    # Differentiate the cost with respect to the trainable variables
    grads = tape.gradient(cost_value, [X,W,b])

    # Take one optimizer step to reduce the cost
    optimizer.apply_gradients(zip(grads, [X,W,b]))

    if it % 20 == 0:
        print(f"Training loss at iteration {it}: {cost_value:0.1f}")
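
To make concrete what the gradient step is working with, here is a NumPy sketch that writes out the gradients of the cost by hand and checks one of them against a finite difference. The helper names `cost` and `grads`, the small random data, and the use of a plain gradient-descent step (instead of Adam) are all assumptions for illustration; in the assignment, `tf.GradientTape` derives these gradients automatically.

```python
import numpy as np

def cost(X, W, b, Y, R, lambda_):
    err = (X @ W.T + b - Y) * R
    return 0.5 * np.sum(err ** 2) + (lambda_ / 2) * (np.sum(X ** 2) + np.sum(W ** 2))

def grads(X, W, b, Y, R, lambda_):
    """Hand-derived gradients of `cost` with respect to X, W and b (R must be 0/1)."""
    err = (X @ W.T + b - Y) * R          # (num_movies, num_users)
    dX = err @ W + lambda_ * X           # (num_movies, num_features)
    dW = err.T @ X + lambda_ * W         # (num_users, num_features)
    db = err.sum(axis=0, keepdims=True)  # (1, num_users); b is not regularized
    return dX, dW, db

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
W = rng.normal(size=(4, 3))
b = rng.normal(size=(1, 4))
R = (rng.random((5, 4)) < 0.6).astype(float)
Y = rng.normal(size=(5, 4)) * R
lambda_ = 1.0

dX, dW, db = grads(X, W, b, Y, R, lambda_)

# Central finite difference on a single entry of X as a sanity check.
eps = 1e-6
Xp, Xm = X.copy(), X.copy()
Xp[0, 0] += eps
Xm[0, 0] -= eps
numeric = (cost(Xp, W, b, Y, R, lambda_) - cost(Xm, W, b, Y, R, lambda_)) / (2 * eps)
print(np.isclose(numeric, dX[0, 0], atol=1e-5))

# One plain gradient-descent step with a small learning rate.
lr = 1e-3
before = cost(X, W, b, Y, R, lambda_)
after = cost(X - lr * dX, W - lr * dW, b - lr * db, Y, R, lambda_)
print(after < before)
```

Adam applies per-parameter scaling and momentum on top of these same gradients, which is why the training loop only needs to supply `grads` and the list of variables.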

You need to set your movie preferences:

my_ratings = np.zeros(num_movies)
# Example ratings (modify these for your preferences)
my_ratings[2700] = 5 # Toy Story 3 (2010)
my_ratings[2609] = 2 # Persuasion (2007)
my_ratings[929] = 5 # Lord of the Rings: Return of the King
my_ratings[246] = 5 # Shrek (2001)
my_ratings[2716] = 3 # Inception
my_ratings[1150] = 5 # Incredibles (2004)
# ... add more ratings based on your preferences
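
# Indices of the movies you rated, used below to skip them in the recommendations
my_rated = [i for i in range(len(my_ratings)) if my_ratings[i] > 0]

# Make predictions using trained weights
p = np.matmul(X.numpy(), np.transpose(W.numpy())) + b.numpy()
# Restore the mean (denormalize)
pm = p + Ymean
# Column 0 holds your predictions, assuming your ratings were added as the first column of Y before training
my_predictions = pm[:,0]
# Sort and display top recommendations
ix = tf.argsort(my_predictions, direction='DESCENDING')
for i in range(17):
    j = ix[i]
    if j not in my_rated:
        print(f'Predicting rating {my_predictions[j]:0.2f} for movie {movieList[j]}')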
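
The training and prediction code relies on `Ynorm` and `Ymean`, which come from mean normalization. The sketch below shows one way to compute them, assuming the mean of each movie is taken over rated entries only. The function name `normalize_ratings` and the tiny example matrices are hypothetical and are not the assignment's helper.

```python
import numpy as np

def normalize_ratings(Y, R):
    """Subtract each movie's mean rating, computed over rated entries only."""
    # Small constant avoids division by zero for movies with no ratings.
    Ymean = (np.sum(Y * R, axis=1) / (np.sum(R, axis=1) + 1e-12)).reshape(-1, 1)
    Ynorm = Y - Ymean * R   # unrated entries stay at 0
    return Ynorm, Ymean

# Two movies, three users; 0 in Y with R == 0 means "not rated".
Y = np.array([[5.0, 4.0, 0.0],
              [0.0, 2.0, 3.0]])
R = np.array([[1, 1, 0],
              [0, 1, 1]])

Ynorm, Ymean = normalize_ratings(Y, R)
print(Ymean.ravel())   # per-movie means over rated entries (about 4.5 and 2.5)
print(Ynorm)           # rated entries are centered around 0

# Denormalizing adds the per-movie mean back to every user's column.
restored = Ynorm + Ymean * R
print(np.allclose(restored, Y))
```

With the mean added back, a prediction starts from the movie's average rating rather than from zero, which is the motivation usually given for using mean normalization with new users.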

What You Must Implement

  1. Cost Function: Complete the cofi_cost_func() with proper for loops and regularization
  2. Movie Ratings: Set up your personal my_ratings[] array with your preferences
  3. Understanding: Comprehend how the training loop uses TensorFlow’s automatic differentiation

After completing this assignment, you should:

  • Understand how the collaborative filtering algorithm is implemented
  • Have experience with TensorFlow custom training loops
  • Have seen a practical recommender system in action on real movie data
  • Understand how mean normalization improves predictions for new users