Feature Engineering

What is Feature Engineering?

Feature engineering involves using domain knowledge and intuition to design new features, usually by transforming or combining original features, to make it easier for learning algorithms to make accurate predictions.

Impact on Performance

The choice of features can have a huge impact on your learning algorithm’s performance. For many practical applications, choosing or engineering the right features is a critical step to making the algorithm work well.

House Price Prediction Example

Original Features

x₁: Width of lot (frontage)
x₂: Depth of lot

Basic Model

f(x) = w₁x₁ + w₂x₂ + b

This model treats width and depth as separate features that each contribute additively to the prediction, so it cannot express their product.

Feature Engineering Insight

Rather than using only width and depth as separate features, create a new feature that combines them:

x₃ = x₁ × x₂ (area of the land)

Enhanced Model

f(x) = w₁x₁ + w₂x₂ + w₃x₃ + b

Now the model can choose parameters to emphasize:

Individual dimensions (frontage, depth)
Combined area measurement
Or any combination based on what the data shows
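
To make the role of each parameter concrete, the sketch below evaluates the enhanced model for a few hand-picked parameter settings. The function name `predict`, the lot dimensions, and the weight values are assumptions chosen for illustration; nothing here is learned from data.

```python
import numpy as np

def predict(x1, x2, w, b):
    """Enhanced model: f(x) = w1*x1 + w2*x2 + w3*(x1*x2) + b."""
    w1, w2, w3 = w
    x3 = x1 * x2  # engineered area feature
    return w1 * x1 + w2 * x2 + w3 * x3 + b

# Two lots with the same area but different shapes (illustrative values).
width = np.array([50.0, 100.0])
depth = np.array([100.0, 50.0])

# Hypothetical parameter settings, picked by hand rather than learned.
dims_only = predict(width, depth, w=(100.0, 20.0, 0.0), b=0.0)  # w3 = 0: basic model
area_only = predict(width, depth, w=(0.0, 0.0, 3.0), b=0.0)     # w1 = w2 = 0
mixed = predict(width, depth, w=(100.0, 20.0, 3.0), b=0.0)      # all three terms

print("dimensions only:", dims_only)
print("area only:      ", area_only)
print("mixed:          ", mixed)
```

With w₃ = 0 the function reduces to the basic model and separates the two lots by shape. With w₁ = w₂ = 0 it assigns both lots the same prediction, because their areas match. Gradient descent or any other fitting procedure would land wherever between these extremes the training data supports.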

Feature Engineering Process

Domain Knowledge Application

Real Estate Insight: Area typically matters more than individual dimensions
Mathematical Combination: Multiply related features to create meaningful new ones
Intuitive Features: Create features that make sense for the problem domain

Transformation Examples

Multiplication

Area = width × depth for land size

Addition

Total rooms = bedrooms + bathrooms + other rooms

Ratios

Price per square foot = price / area (usable as an input only when price is not the target being predicted)

Boolean

Has garage = 1 if garage exists, 0 otherwise
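
The four transformation types can be sketched with NumPy arrays as follows. All column names and values are hypothetical. Because price is the prediction target in the house example, the ratio shown here is built from two input columns (rooms per unit area), which is an assumption swapped in for illustration.

```python
import numpy as np

# Hypothetical raw columns for three houses (illustrative values only).
width = np.array([50.0, 40.0, 60.0])
depth = np.array([100.0, 120.0, 90.0])
bedrooms = np.array([3, 2, 4])
bathrooms = np.array([2, 1, 3])
other_rooms = np.array([2, 2, 3])
garage_spaces = np.array([1, 0, 2])

area = width * depth                              # multiplication
total_rooms = bedrooms + bathrooms + other_rooms  # addition
rooms_per_area = total_rooms / area               # ratio of two inputs
has_garage = (garage_spaces > 0).astype(int)      # boolean encoded as 0/1

# Stack the engineered columns into one feature matrix, one row per house.
X_new = np.column_stack([area, total_rooms, rooms_per_area, has_garage])
print(X_new.shape)
```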

Benefits of Feature Engineering

Enhanced Model Flexibility

More Options: The algorithm can weight both original and engineered features
Better Representations: Engineered features may capture relationships more directly
Improved Performance: Often leads to more accurate predictions

Incorporating Expertise

Domain Knowledge: Use understanding of the problem to guide feature creation
Business Logic: Include features that make sense from a practical perspective
Intuitive Relationships: Capture known relationships between variables

When to Apply Feature Engineering

Before Training

Data Exploration: Understand relationships between existing features
Domain Research: Learn what experts in the field consider important
Visualization: Plot features to identify potential combinations

Feature Selection Considerations

Relevance: New features should be logically related to the target variable
Uniqueness: Avoid creating features that are nearly identical to existing ones
Interpretability: Consider whether engineered features make sense to stakeholders
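
One rough way to check the uniqueness consideration is to measure how strongly a candidate feature correlates with the features already in the matrix. The sketch below uses Pearson correlation via `np.corrcoef`. The helper name `max_abs_correlation`, the 0.95 cutoff, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def max_abs_correlation(candidate, existing):
    """Largest |Pearson r| between candidate and any column of existing."""
    correlations = [
        abs(np.corrcoef(candidate, existing[:, j])[0, 1])
        for j in range(existing.shape[1])
    ]
    return max(correlations)

# Synthetic lot dimensions (illustrative values only).
rng = np.random.default_rng(1)
width = rng.uniform(20, 80, size=500)
depth = rng.uniform(50, 150, size=500)
existing = np.column_stack([width, depth])

area = width * depth             # combines both inputs
width_in_feet = width * 3.28084  # hypothetical unit conversion: no new information

THRESHOLD = 0.95  # assumed cutoff; tune for your problem
for name, feature in [("area", area), ("width_in_feet", width_in_feet)]:
    r = max_abs_correlation(feature, existing)
    verdict = "near-duplicate" if r > THRESHOLD else "keep"
    print(f"{name}: max |r| = {r:.3f} -> {verdict}")
```

Pearson correlation only detects linear redundancy, so a low value does not guarantee that a feature adds useful information; treat it as a quick screen rather than a final test.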

Iterative Process

Create Features: Based on domain knowledge and intuition
Train Model: Test performance with new features
Evaluate Results: Compare to baseline model
Refine: Adjust or create additional features as needed
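
The loop above can be sketched as a comparison between a baseline feature set and an engineered one on held-out data. This is a minimal sketch under stated assumptions: the data is synthetic (price is generated from area plus noise), the fit uses ordinary least squares through `np.linalg.lstsq` instead of any particular training procedure, and the helper names `fit_linear` and `mse` are hypothetical.

```python
import numpy as np

def fit_linear(features, target):
    """Least-squares fit; returns weights with the intercept as the last entry."""
    design = np.column_stack([features, np.ones(len(features))])
    params, *_ = np.linalg.lstsq(design, target, rcond=None)
    return params

def mse(params, features, target):
    """Mean squared error of the linear model on the given data."""
    design = np.column_stack([features, np.ones(len(features))])
    return float(np.mean((design @ params - target) ** 2))

# Synthetic data (assumption): price is driven by area, plus noise.
rng = np.random.default_rng(42)
m = 400
width = rng.uniform(20, 80, size=m)
depth = rng.uniform(50, 150, size=m)
price = 50.0 * width * depth + rng.normal(0, 5000, size=m)

# Step 1: create features.
X_base = np.column_stack([width, depth])
X_eng = np.column_stack([width, depth, width * depth])

# Hold out the last 100 rows for evaluation.
train, val = slice(0, 300), slice(300, m)

# Steps 2 and 3: train each model and compare against the baseline.
results = {}
for name, X in [("baseline", X_base), ("engineered", X_eng)]:
    params = fit_linear(X[train], price[train])
    results[name] = mse(params, X[val], price[val])
    print(f"{name}: validation MSE = {results[name]:.3e}")
```

If the engineered feature set does not lower the validation error, that is the signal to return to the refine step and try a different transformation.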

Practical Implementation

Code Example

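import numpy as np

# X: (m, 2) array with columns [width, depth]; sample values are illustrative
X = np.array([[50.0, 100.0], [40.0, 120.0], [60.0, 90.0]])

# Original features
width = X[:, 0]
depth = X[:, 1]

# Engineered feature: lot area
area = width * depth

# Combined feature matrix with columns [width, depth, area], shape (m, 3)
X_engineered = np.column_stack([width, depth, area])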

Common Patterns

Multiplicative Features: Combine related measurements
Polynomial Features: Covered in the next section
Categorical Encoding: Transform categorical variables into numerical features
Time-based Features: Extract day, month, season from dates
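
Two of these patterns, categorical encoding and time-based features, can be sketched with the standard library and NumPy. The helpers `one_hot` and `season`, the listing records, and the Northern Hemisphere season mapping are assumptions made for illustration.

```python
from datetime import date

import numpy as np

def one_hot(values):
    """Return (categories, matrix) with one 0/1 column per distinct value."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    matrix = np.zeros((len(values), len(categories)), dtype=int)
    for row, value in enumerate(values):
        matrix[row, index[value]] = 1
    return categories, matrix

def season(month):
    """Map a month number (1-12) to a Northern Hemisphere meteorological season."""
    return ["winter", "spring", "summer", "autumn"][(month % 12) // 3]

# Hypothetical listing records (illustrative values only).
house_types = ["detached", "condo", "detached", "townhouse"]
sale_dates = [date(2023, 1, 15), date(2023, 4, 2), date(2023, 7, 30), date(2023, 12, 5)]

# Categorical encoding: one column per house type.
categories, type_matrix = one_hot(house_types)

# Time-based features extracted from each sale date.
months = [d.month for d in sale_dates]
weekdays = [d.weekday() for d in sale_dates]  # Monday = 0
seasons = [season(m) for m in months]

print(categories)
print(type_matrix)
print(list(zip(months, weekdays, seasons)))
```

The season labels are themselves categorical, so they would pass through `one_hot` as well before being fed to a linear model.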

Future Learning

Later in the specialization, you’ll learn systematic methods for:

Feature Selection: Choosing which features to include
Model Evaluation: Measuring how well different feature sets perform
Automated Feature Engineering: Techniques for systematic feature creation

Feature engineering represents the intersection of domain expertise and machine learning technique. By thoughtfully combining and transforming features, you can often achieve significant improvements in model performance.