Clean Data, Happy Code: A Partial Guide to Validating & Sanitizing User Input in JavaScript with Yup!

Introduction: Why Validate and Sanitize?

Let’s face it: user input can be messy. Like rowdy toddlers let loose in your living room, users can introduce all kinds of unexpected and unruly data into your web application. Without proper checks, that data can muddy up your codebase, break your app’s logic, or even open doors to security vulnerabilities.

As a senior developer who’s been in the trenches for over 15 years, I’ve learned that validating and sanitizing data is essential. Imagine you’re a chef with a kitchen full of fresh ingredients—validating is like checking the quality of the produce before you cook, and sanitizing is like washing your veggies before serving them. In this blog post, I’ll show you how to keep your data squeaky clean by using robust validation techniques (with the help of the Yup library) and rock-solid sanitization strategies (with DOMPurify).

We’ll cover what validation and sanitization actually mean, how they differ, and why it’s crucial to implement them both. I’ll share real-world code samples for email, phone numbers, Git commit hashes, IP addresses, and even those fancy Web3 wallet addresses. By the end, you’ll have the knowledge and tools to ensure every byte of user input that touches your codebase is as pure and safe as a shiny new kitchen utensil.

Validation vs. Sanitization: The Dynamic Duo

Before we break out our tools, let’s clarify what validation and sanitization are. Think of validation as the bouncer standing at the club’s entrance, checking IDs, dress codes, and guest lists to ensure only the right folks get in. Sanitization is more like the cleaning crew inside the club—making sure everything stays neat, tidy, and safe once everyone’s dancing around.

Validation: Ensures data meets certain rules before you accept it. Validate early and often—on the client side for a smoother user experience, and on the server side as the last line of defense.
Sanitization: Neutralizes harmful or malicious content before it gets displayed back to users. It’s your secret weapon against cross-site scripting (XSS) attacks and other nasty surprises.

In practice, you’ll validate things like, “Is this a properly formatted email?” or “Does this username only contain letters, numbers, and underscores?” Then, when you display user-generated comments or HTML snippets, you sanitize that output to ensure that <script> tags and sneaky JavaScript event handlers aren’t lurking inside.

Getting Started with Yup: Your Validation Sous-Chef

Enter Yup, a delightful schema-based validation library that makes defining rules for your data feel like reading a simple recipe card. Instead of juggling regular expressions or endless if conditions, you create a schema that says, “This field should be a string, at least 10 characters long, and match a certain pattern.” Yup handles the rest.

Let’s start with a few common validation examples:

Validating an Email Address:

import * as yup from 'yup';

const emailSchema = yup.string().email('Must be a valid email address').required('Email is required');

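When you want to confirm that an email provided by a user is well-formed, feed it into this schema with emailSchema.isValid(value) or emailSchema.validate(value); both return promises. Keep in mind that this checks the format only. It cannot tell you whether the mailbox actually exists.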

Validating a Phone Number (Digits Only):

const phoneSchema = yup.string()
  .matches(/^[0-9]{10,15}$/, 'Must be a valid phone number with 10 to 15 digits')
  .required('Phone number is required');

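This schema accepts only a bare string of 10 to 15 digits. That keeps the rule simple, but formatted input such as +1 (555) 010-9999 is rejected unless you strip the punctuation before validating.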
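
Real-world phone input rarely arrives as bare digits. One way to handle that, sketched below, is to strip common formatting characters before the value is checked against the same pattern phoneSchema uses. The names normalizePhone and isValidPhone, and the set of characters being stripped, are assumptions made for illustration; they are not part of Yup.

```javascript
// Hypothetical helper (not part of Yup): strip common phone formatting
// so the remaining string can be checked against the digits-only rule.
function normalizePhone(input) {
  // Drop one leading plus sign, then spaces, dots, dashes, and parentheses.
  return String(input).trim().replace(/^\+/, '').replace(/[\s().-]/g, '');
}

// Same pattern as phoneSchema: 10 to 15 digits and nothing else.
const PHONE_DIGITS = /^[0-9]{10,15}$/;

function isValidPhone(input) {
  return PHONE_DIGITS.test(normalizePhone(input));
}

console.log(isValidPhone('+1 (555) 010-9999')); // true
console.log(isValidPhone('555-0100'));          // false (only 7 digits)
console.log(isValidPhone('call me maybe'));     // false
```

Inside a Yup schema, the same normalization could plausibly run in a .transform() step placed before .matches(), so the stored value is already in digits-only form.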

Validating a Git Commit Hash (40-char hex):

const gitHashSchema = yup.string()
  .matches(/^[0-9a-f]{40}$/, 'Must be a 40-character hex string')
  .required('Commit hash is required');

If you’re building tooling around code commits, ensuring your commit hash is correctly formatted is a breeze.
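
Users often paste abbreviated hashes (git log --oneline prints short ones), and repositories created with the SHA-256 object format use 64-character hashes. If your tooling needs to accept those, a small classifier like the sketch below may help. The function name classifyGitHash and the 7-character minimum for abbreviations are assumptions chosen for illustration, so adjust them to your own workflow.

```javascript
// Hypothetical helper: classify a string as a full or abbreviated Git hash.
// The 7-character lower bound for abbreviations is an assumption, not a Git rule.
function classifyGitHash(input) {
  const value = String(input).trim().toLowerCase();
  if (!/^[0-9a-f]+$/.test(value)) return 'invalid';
  if (value.length === 40) return 'sha1';   // full SHA-1 object name
  if (value.length === 64) return 'sha256'; // full SHA-256 object name
  if (value.length >= 7 && value.length < 40) return 'abbreviated';
  return 'invalid';
}

console.log(classifyGitHash('a1b2c3d'));       // abbreviated
console.log(classifyGitHash('A'.repeat(40)));  // sha1
console.log(classifyGitHash('not-a-hash'));    // invalid
```

An abbreviated hash is only a format match here. Whether it resolves to exactly one commit is something only the repository itself can answer.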

Validating an IP Address:

const ipSchema = yup.string()
  .matches(/^(25[0-5]|2[0-4]\d|[01]?\d?\d)(\.(25[0-5]|2[0-4]\d|[01]?\d?\d)){3}$/, 'Must be a valid IPv4 address')
  .required('IP address is required');

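This might look intimidating, but Yup doesn’t bat an eye. Two caveats: the pattern covers IPv4 only, and it accepts octets with leading zeros, such as 192.168.001.001. If either matters for your use case, you will need a stricter check.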
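
If the regex feels hard to maintain, an alternative is to parse the octets instead of pattern-matching the whole string. The sketch below is one such approach; isStrictIPv4 is a hypothetical name, and rejecting leading zeros is a deliberate assumption here, because some network parsers read a leading zero as octal.

```javascript
// Hypothetical helper: validate dotted-decimal IPv4 by parsing each octet.
// Unlike the regex above, it rejects leading zeros such as "192.168.001.1".
function isStrictIPv4(input) {
  if (typeof input !== 'string') return false;
  const parts = input.split('.');
  if (parts.length !== 4) return false;
  return parts.every((part) => {
    // One to three digits, no sign, no whitespace.
    if (!/^\d{1,3}$/.test(part)) return false;
    // Reject leading zeros ("01"), but allow a lone "0".
    if (part.length > 1 && part[0] === '0') return false;
    return Number(part) <= 255;
  });
}

console.log(isStrictIPv4('192.168.1.1'));   // true
console.log(isStrictIPv4('192.168.001.1')); // false
console.log(isStrictIPv4('256.1.1.1'));     // false
console.log(isStrictIPv4('1.2.3'));         // false
```

A function like this can be attached to a Yup string schema through .test(), for example with a callback along the lines of value => value == null || isStrictIPv4(value), so the error message still comes from the schema.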

Beyond the Basics: Web3 Wallets and More Complex Schemas

The beauty of Yup is that it can handle nearly anything you throw at it. Let’s say you’re building a dApp (decentralized application) and you need to validate an Ethereum wallet address. An Ethereum address is 42 characters long: the prefix 0x followed by 40 hex characters. The pattern below checks that format only. It does not verify the mixed-case EIP-55 checksum, so a mistyped address can still pass.

Validating a Web3 Wallet Address:

const ethWalletSchema = yup.string()
  .matches(/^0x[a-fA-F0-9]{40}$/, 'Must be a valid Ethereum wallet address')
  .required('Wallet address is required');

You can also combine multiple fields into one schema, ensuring that a user’s entire form meets your criteria. For example, let’s say we want a user object that requires a username, password, and email, each with their own rules:

Validating a User Registration Form:

const userSchema = yup.object().shape({
  username: yup.string()
    .matches(/^[a-zA-Z0-9_]+$/, 'Username can only contain letters, numbers, and underscores')
    .min(3, 'Username must be at least 3 characters long')
    .max(20, 'Username cannot be longer than 20 characters')
    .required('Username is required'),
  password: yup.string()
    .min(8, 'Password must be at least 8 characters long')
    .required('Password is required'),
  email: yup.string()
    .email('Please provide a valid email')
    .required('Email is required'),
});

const userData = {
  username: 'John_Doe',
  password: 'securePa55',
  email: 'john.doe@example.com',
};

userSchema.validate(userData)
  .then(validData => console.log('Valid user data:', validData))
  .catch(err => console.error('Validation error:', err.errors));

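It’s like having a personal kitchen assistant who checks every ingredient before adding it to the dish. Note that validate() stops at the first failing rule by default, so err.errors holds a single message; pass { abortEarly: false } as the second argument to collect every failure at once. With Yup, you can whip up complex validations that reflect real-world business logic, so the data your app receives is fresh, clean, and ready to go.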

Sanitizing Output with DOMPurify: Keeping It Sparkling Clean

Validation ensures the data is correct, but what if that correct data still contains malicious HTML? Enter DOMPurify, the dependable janitor that tidies up your HTML before you serve it to the user’s browser.

Think of DOMPurify like a dishwasher: you load it up with user-generated HTML, and it removes any dirt, grime, and suspicious <script> tags. It’s well-maintained, trusted in the community, and can handle just about anything thrown its way.

Example of Sanitizing HTML in the Browser:

import DOMPurify from 'dompurify';

const userInputHTML = `
  <div><strong>Hello</strong> world!</div>
  <img src="http://example.com/image.jpg" onerror="alert('XSS')">
  <script>alert('HACKED!');</script>
`;

const sanitizedOutput = DOMPurify.sanitize(userInputHTML);
// sanitizedOutput now has malicious scripts and attributes removed.

Now, when you display sanitizedOutput on your page, you know no sneaky JavaScript code will run. You can rest easy knowing your users won’t get a nasty surprise while reading other people’s comments or reviewing user-submitted product descriptions.

Server-Side Sanitization: If you need to sanitize HTML on the server (Node.js), just add jsdom:

npm install dompurify jsdom

import { JSDOM } from 'jsdom';
import createDOMPurify from 'dompurify';

const { window } = new JSDOM('');
const DOMPurify = createDOMPurify(window);

const maliciousHTML = `
  <h1>Welcome</h1>
  <script>alert('HACKED!');</script>
`;

const clean = DOMPurify.sanitize(maliciousHTML);
// 'clean' now contains safe HTML.

With DOMPurify, you know the data you’re rendering is safe—no ifs, ands, or <scripts> about it.

Wrapping Up: Bringing It All Together

Congratulations! You’ve now got a rock-solid process for handling user input like a seasoned pro. Just like a chef who carefully selects fresh ingredients and washes them before plating, you’re doing the same for your data: using Yup to validate input (ensuring it’s fresh and fits the recipe) and DOMPurify to sanitize output (ensuring it’s safe to serve).

Validate Early and Often: Use Yup’s schemas to check that emails, phone numbers, and IPv4 addresses are well-formed, and that even fancy stuff like Web3 wallet addresses matches the expected format. A format check cannot prove that the address or number actually exists.
Sanitize Before Serving: Cleanse your HTML with DOMPurify to remove dangerous tags and attributes, preventing attackers from slipping malicious code onto your site.

In the 15+ years I’ve been working in this field, I’ve seen firsthand how a careful approach to input validation and output sanitization can save you countless headaches. By investing in these best practices, you’re not only writing more secure and stable code—you’re also giving yourself peace of mind, knowing that your users and your application are well-protected.

Now, go forth and build that dream application. With these techniques under your belt, you’ll keep your data as pure as mountain spring water and deliver a user experience that’s both secure and delightful. Happy coding!