Recently I was tasked with evaluating several possible solutions for a SQL Server search engine to search a database of students. For a proof of concept, I had a small snapshot of approximately 200,000 students. I needed to expand this data to simulate a much larger data set of 1,750,000 students for performance and load testing.

Using a Common Table Expression to retrieve the students in random order, I combined data from multiple random students.

In the example below, I  was able to “mix” two random students’ first and last names and student IDs. To combine more students, simply repeat the join to studentCTE.

Using his method, I was able to greatly expand my available data starting from a snapshot of real student data.  In the case of my client, I combined phone area codes, phone numbers (with randomized line number), email domains and address information to provide a greater set of test data.


Like this post? Share it!