My thesis data covers over 1000 households from a 70-page survey and is split across multiple datasets. It is too large to handle in Excel.

I am trying to merge the datasets (by variable). However, I am having difficulty because my data has unique identifiers for households, but the people within each household are simply numbered in order (1, 2, 3, 4). The household number appears in all the datasets (sometimes only once and sometimes multiple times, depending on how many individuals per household answered that part of the survey).

I don't think I should create a random ID for the individuals (before merging) because I would have to do this in multiple datasets and they would not match and merge properly.

Is it possible to create a unique identifier for the individuals, ideally tied to the household number? So if the household ID is 11222 and the individual ID is 5, it would be nice if there was a new variable listed as 11222_5.
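To make the idea concrete, here is a minimal sketch in plain Python of the kind of identifier I mean. The field names (hh_id, person_id, uid), the two small datasets, and their values are all made up for illustration; my real data is structured differently and much larger. Because the ID is derived from the two existing columns rather than generated randomly, it should come out identical in every dataset.

```python
# Hypothetical records; the field names and values are assumptions for illustration.
demographics = [
    {"hh_id": 11222, "person_id": 5, "age": 34},
    {"hh_id": 11222, "person_id": 1, "age": 61},
    {"hh_id": 11305, "person_id": 1, "age": 45},
]
employment = [
    {"hh_id": 11222, "person_id": 5, "employed": True},
    {"hh_id": 11305, "person_id": 1, "employed": False},
]


def make_uid(row):
    """Build a household-tied individual ID such as '11222_5'."""
    return f"{row['hh_id']}_{row['person_id']}"


def add_uid(rows):
    """Return copies of the rows with a 'uid' field added."""
    return [{**row, "uid": make_uid(row)} for row in rows]


demographics = add_uid(demographics)
employment = add_uid(employment)

# Index one dataset by uid, then attach its fields to the matching rows of
# the other. Rows with no match keep only their original fields.
employment_by_uid = {row["uid"]: row for row in employment}
merged = [{**row, **employment_by_uid.get(row["uid"], {})} for row in demographics]
```

The underscore matters: without a separator, household 1122 with person 25 and household 11222 with person 5 would both become 112225.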

Is there a better method to combine these datasets?

Any help is much appreciated!