Pablo Rodriguez

Feature Engineering

Feature engineering involves using domain knowledge and intuition to design new features, usually by transforming or combining original features, to make it easier for learning algorithms to make accurate predictions.

The choice of features can have a huge impact on your learning algorithm’s performance. For many practical applications, choosing or engineering the right features is a critical step to making the algorithm work well.

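Consider predicting the price of a house from two features of its lot:

  • x₁: Width of lot (frontage)
  • x₂: Depth of lot

A linear model on these two features is:

f(x) = w₁x₁ + w₂x₂ + b

This model treats width and depth as separate features, so it can only add up their individual contributions.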

Rather than relying on width and depth alone, create a new feature that combines them:

x₃ = x₁ × x₂ (area of the land)

Adding it to the model gives:

f(x) = w₁x₁ + w₂x₂ + w₃x₃ + b

Now the model can choose parameters to emphasize:

  • Individual dimensions (frontage, depth)
  • Combined area measurement
  • Or any combination based on what the data shows

Key takeaways from this example:

  • Real Estate Insight: Area is often more predictive of price than either dimension alone
  • Mathematical Combination: Multiply related features to create meaningful new ones
  • Intuitive Features: Create features that make sense for the problem domain
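
As a minimal sketch of how the weights shift emphasis between the dimensions and the area, the snippet below evaluates the three-feature model with two different weight settings. The `predict` helper, the lot sizes, and the weight values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def predict(x1, x2, w, b):
    """Evaluate f(x) = w1*x1 + w2*x2 + w3*x3 + b, where x3 = x1 * x2."""
    x3 = x1 * x2  # engineered feature: lot area
    return w[0] * x1 + w[1] * x2 + w[2] * x3 + b

# Two made-up lots (width and depth in feet)
width = np.array([40.0, 50.0])
depth = np.array([100.0, 120.0])

# Hypothetical weights that rely only on the area term
area_only = predict(width, depth, w=(0.0, 0.0, 30.0), b=5000.0)

# Hypothetical weights that rely only on the individual dimensions
dims_only = predict(width, depth, w=(100.0, 50.0, 0.0), b=5000.0)

print(area_only)
print(dims_only)
```

In practice the learning algorithm picks the weights from the data, so any mix of the three terms is possible.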

Multiplication

Area = width × depth for land size

Addition

Total rooms = bedrooms + bathrooms + other rooms

Ratios

Price per square foot = price / area (usable as an input only when price is not the value being predicted)

Boolean

Has garage = 1 if garage exists, 0 otherwise
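
The four feature types above could be computed roughly as follows. This is only a sketch: the column names and the values for the three houses are invented, and NumPy arrays stand in for whatever data structure the real dataset uses.

```python
import numpy as np

# Hypothetical raw columns for three houses (all values are made up)
width = np.array([40.0, 50.0, 60.0])      # feet
depth = np.array([100.0, 120.0, 90.0])    # feet
bedrooms = np.array([3, 4, 2])
bathrooms = np.array([2, 3, 1])
other_rooms = np.array([3, 4, 2])
garage_spaces = np.array([0, 2, 1])
price = np.array([400_000.0, 660_000.0, 486_000.0])

# Multiplication: combine two measurements into one
area = width * depth

# Addition: total up related counts
total_rooms = bedrooms + bathrooms + other_rooms

# Ratio: normalize one quantity by another.
# Note: this uses price, so it is not a valid input if price is the target.
price_per_sqft = price / area

# Boolean: encode presence/absence as 1/0
has_garage = (garage_spaces > 0).astype(int)
```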

Engineered features help for several reasons:

  • More Options: The algorithm can choose from both original and engineered features
  • Better Representations: Engineered features may capture relationships more directly
  • Improved Performance: Often leads to more accurate predictions

Let domain knowledge guide which features you create:

  • Domain Knowledge: Use understanding of the problem to guide feature creation
  • Business Logic: Include features that make sense from a practical perspective
  • Intuitive Relationships: Capture known relationships between variables

To find candidate features:

  • Data Exploration: Understand relationships between existing features
  • Domain Research: Learn what experts in the field consider important
  • Visualization: Plot features to identify potential combinations

Check each candidate against these criteria:

  • Relevance: New features should be logically related to the target variable
  • Uniqueness: Avoid creating features that are nearly identical to existing ones
  • Interpretability: Consider whether engineered features make sense to stakeholders
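
One possible way to check the uniqueness criterion is to look for pairs of features that are almost perfectly correlated. The sketch below does this on synthetic data; the feature names, the random lot sizes, and the 0.98 cutoff are assumptions made for illustration, not a recommended standard.

```python
import numpy as np

rng = np.random.default_rng(1)
width = rng.uniform(20, 100, size=500)   # feet (synthetic)
depth = rng.uniform(50, 200, size=500)   # feet (synthetic)
area_sqft = width * depth
area_sqm = area_sqft * 0.092903          # same information in another unit

features = {
    "width": width,
    "depth": depth,
    "area_sqft": area_sqft,
    "area_sqm": area_sqm,
}
names = list(features)

# Pairwise Pearson correlations; each row of the stacked array is one feature
corr = np.corrcoef(np.vstack([features[n] for n in names]))

THRESHOLD = 0.98  # assumed cutoff for "nearly identical"
redundant = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > THRESHOLD
]
print(redundant)
```

Here only the two area columns are flagged, since one is a rescaled copy of the other. Correlation only detects linear redundancy, so it is a first check rather than a complete test.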

Feature engineering is an iterative process:

  1. Create Features: Based on domain knowledge and intuition
  2. Train Model: Fit the model with the new features included
  3. Evaluate Results: Compare performance to the baseline model
  4. Refine: Adjust or create additional features as needed
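
The loop above can be sketched on synthetic data as follows. Everything here is an assumption made for illustration: the prices are generated from lot area plus noise, the `fit_linear` and `mse` helpers are hypothetical, and a plain least-squares fit stands in for whatever training procedure is actually used.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of a linear model; returns weights with the bias last."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mse(coef, X, y):
    """Mean squared error of the fitted model on (X, y)."""
    A = np.column_stack([X, np.ones(len(y))])
    return float(np.mean((A @ coef - y) ** 2))

# Synthetic lots: price is driven by area plus noise (made-up relationship)
rng = np.random.default_rng(0)
width = rng.uniform(20, 100, size=400)
depth = rng.uniform(50, 200, size=400)
price = 30.0 * width * depth + rng.normal(0, 5000, size=400)

# Step 1: create the engineered feature
X_base = np.column_stack([width, depth])
X_eng = np.column_stack([width, depth, width * depth])

# Simple hold-out split: first 300 rows for training, last 100 for testing
train, test = slice(0, 300), slice(300, 400)

# Steps 2 and 3: train both models and compare on the held-out rows
baseline_mse = mse(fit_linear(X_base[train], price[train]), X_base[test], price[test])
engineered_mse = mse(fit_linear(X_eng[train], price[train]), X_eng[test], price[test])

print(f"baseline MSE:   {baseline_mse:.3e}")
print(f"engineered MSE: {engineered_mse:.3e}")
```

Step 4 would repeat this comparison for each new candidate feature, keeping only those that lower the held-out error.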
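
feature_engineering.py
import numpy as np

# Example data, assumed for illustration: one row per lot, columns = [width, depth]
X = np.array([[50.0, 100.0], [60.0, 120.0]])

# Original features
width = X[:, 0]
depth = X[:, 1]
# Engineered feature: lot area
area = width * depth
# Combined feature matrix with columns [width, depth, area]
X_engineered = np.column_stack([width, depth, area])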

Common feature engineering techniques include:

  • Multiplicative Features: Combine related measurements
  • Polynomial Features: Covered in the next section
  • Categorical Encoding: Transform categorical variables into numerical features
  • Time-based Features: Extract day, month, season from dates
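
As a rough illustration of the last two techniques, the sketch below one-hot encodes a category and extracts day, month, and season from a date using only the standard library. The `one_hot`, `season`, and `date_features` helpers, the house-type categories, and the northern-hemisphere season mapping are all assumptions made for this example.

```python
from datetime import date

def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    return [1 if value == c else 0 for c in categories]

def season(month):
    """Map a month number to a season (assumed northern-hemisphere convention)."""
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "autumn"

def date_features(d):
    """Extract simple time-based features from a date."""
    return {"day": d.day, "month": d.month, "season": season(d.month)}

HOUSE_TYPES = ["detached", "townhouse", "condo"]  # hypothetical categories

encoded = one_hot("townhouse", HOUSE_TYPES)
sold = date_features(date(2023, 7, 15))

print(encoded)
print(sold)
```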

Later in the specialization, you’ll learn systematic methods for:

  • Feature Selection: Choosing which features to include
  • Model Evaluation: Measuring how well different feature sets perform
  • Automated Feature Engineering: Techniques for systematic feature creation

Feature engineering represents the intersection of domain expertise and machine learning technique. By thoughtfully combining and transforming features, you can often achieve significant improvements in model performance.