
stata.com merge — Merge datasets
Syntax      Menu      Description      Options      Remarks and examples      References      Also see
Syntax

One-to-one merge on specified key variables
    merge 1:1 varlist using filename [, options]

Many-to-one merge on specified key variables
    merge m:1 varlist using filename [, options]

One-to-many merge on specified key variables
    merge 1:m varlist using filename [, options]

Many-to-many merge on specified key variables
    merge m:m varlist using filename [, options]

One-to-one merge by observation
    merge 1:1 _n using filename [, options]

options                Description

Options
  keepusing(varlist)   variables to keep from using data; default is all
  generate(newvar)     name of new variable to mark merge results; default is _merge
  nogenerate           do not create _merge variable
  nolabel              do not copy value-label definitions from using
  nonotes              do not copy notes from using
  update               update missing values of same-named variables in master with values from using
  replace              replace all values of same-named variables in master with nonmissing values from using (requires update)
  noreport             do not display match result summary table
  force                allow string/numeric variable type mismatch without error

Results
  assert(results)      specify required match results
  keep(results)        specify which match results to keep

  sorted               do not sort; dataset already sorted

sorted does not appear in the dialog box.


Menu

Data > Combine datasets > Merge two datasets

Description

merge joins corresponding observations from the dataset currently in memory (called the master dataset) with those from filename.dta (called the using dataset), matching on one or more key variables. merge can perform match merges (one-to-one, one-to-many, many-to-one, and many-to-many), which are often called joins by database people. merge can also perform sequential merges, which have no equivalent in the relational database world.

merge is for adding new variables from a second dataset to existing observations. You use merge, for instance, when combining hospital patient and discharge datasets. If you wish to add new observations to existing variables, then see [D] append. You use append, for instance, when adding current discharges to past discharges.

By default, merge creates a new variable, _merge, containing numeric codes concerning the source and the contents of each observation in the merged dataset. These codes are explained below in the match results table.

Key variables cannot be strLs.

If filename is specified without an extension, then .dta is assumed.

Options

keepusing(varlist) specifies the variables from the using dataset that are kept in the merged dataset. By default, all variables are kept. For example, if your using dataset contains 2,000 demographic characteristics but you want only sex and age, then type merge ..., keepusing(sex age) ....

generate(newvar) specifies that the variable containing match results information should be named newvar rather than _merge.

nogenerate specifies that _merge not be created. This would be useful if you also specified keep(match), because keep(match) ensures that all values of _merge would be 3.

nolabel specifies that value-label definitions from the using file be ignored. This option is rarely needed, because definitions from the master are already used.

nonotes specifies that notes in the using dataset not be added to the merged dataset; see [D] notes.

update and replace both perform an update merge rather than a standard merge. In a standard merge, the data in the master are the authority and inviolable. For example, if the master and using datasets both contain a variable age, then matched observations will contain values from the master dataset, while unmatched observations will contain values from their respective datasets.

If update is specified, then matched observations will update missing values from the master dataset with values from the using dataset. Nonmissing values in the master dataset will be unchanged.

If replace is specified, then matched observations will contain values from the using dataset, unless the value in the using dataset is missing.

Specifying either update or replace affects the meanings of the match codes. See Treatment of overlapping variables for details.
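The per-value rule for one overlapping variable in a matched observation can be sketched in Python (illustrative only, not Stata; None stands in for Stata's missing value, collapsing the distinction between . and .a):

```python
# Illustrative sketch (not Stata): how one overlapping value in a matched
# observation is resolved. None stands in for Stata's missing value.
# In Stata, replace requires (and implies) update.
def resolve(master_val, using_val, update=False, replace=False):
    if not update and not replace:
        return master_val                         # standard merge: master is inviolable
    if replace:
        # replace: take the using value unless it is missing
        return using_val if using_val is not None else master_val
    # update: fill in only missing master values
    return master_val if master_val is not None else using_val

resolve(42, 7)                                # -> 42 (standard merge keeps master)
resolve(None, 7, update=True)                 # -> 7  (missing master value updated)
resolve(42, 7, update=True)                   # -> 42 (nonmissing master unchanged)
resolve(42, 7, update=True, replace=True)     # -> 7  (replace takes nonmissing using)
```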


noreport specifies that merge not present its summary table of match results.

force allows string/numeric variable type mismatches, resulting in missing values from the using dataset. If omitted, merge issues an error; if specified, merge issues a warning.

Results

assert(results) specifies the required match results. The possible results are

    Numeric
    code     Equivalent word (results)   Description

    1        master                      observation appeared in master only
    2        using                       observation appeared in using only
    3        match                       observation appeared in both

    4        match_update                observation appeared in both, missing values updated
    5        match_conflict              observation appeared in both, conflicting nonmissing values

    Codes 4 and 5 can arise only if the update option is specified. If codes of both 4 and 5 could pertain to an observation, then 5 is used.

Numeric codes and words are equivalent when used in the assert() or keep() options.

The following synonyms are allowed: masters for master, usings for using, matches and matched for match, match_updates for match_update, and match_conflicts for match_conflict.
Using assert(match master) specifies that the merged file is required to include only matched master or using observations and unmatched master observations, and may not include unmatched using observations. Specifying assert() results in merge issuing an error if there are match results other than those you allowed.
The order of the words or codes is not important, so all the following assert() specifications would be the same:

    assert(match master)
    assert(master matches)
    assert(1 3)
When the match results contain codes other than those allowed, return code 9 is returned, and the merged dataset with the unanticipated results is left in memory to allow you to investigate.

keep(results) specifies which observations are to be kept from the merged dataset. Using keep(match master) specifies keeping only matched observations and unmatched master observations after merging.

keep() differs from assert() because it selects observations from the merged dataset rather than enforcing requirements. keep() is used to pare the merged dataset to a given set of observations when you do not care if there are other observations in the merged dataset. assert() is used to verify that only a given set of observations is in the merged dataset.
You can specify both assert() and keep(). If you require matched observations and unmatched master observations but you want only the matched observations, then you could specify assert(match master) keep(match).


assert() and keep() are convenience options whose functionality can be duplicated using merge directly:

    . merge ..., assert(match master) keep(match)

is identical to

    . merge ...
    . assert _merge==1 | _merge==3
    . keep if _merge==3
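The division of labor between the two options can be sketched in Python (illustrative only, not Stata): assert() verifies the set of match results that occurred, while keep() filters the merged observations.

```python
# Illustrative sketch (not Stata): assert() verifies which match codes
# occurred; keep() filters observations by code. Codes: 1 master only,
# 2 using only, 3 matched.
def check_and_keep(observations, allowed=None, kept=None):
    """observations is a list of (data, merge_code) pairs."""
    if allowed is not None:
        bad = {code for _, code in observations} - set(allowed)
        if bad:
            # like merge's return code 9 on unanticipated results
            raise ValueError(f"unanticipated match results: {sorted(bad)}")
    if kept is not None:
        observations = [(d, c) for d, c in observations if c in kept]
    return observations

obs = [("a", 3), ("b", 3), ("c", 1)]
check_and_keep(obs, allowed={1, 3}, kept={3})   # -> [('a', 3), ('b', 3)]
```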
The following option is available with merge but is not shown in the dialog box:

sorted specifies that the master and using datasets are already sorted by varlist. If the datasets are already sorted, then merge runs a little more quickly; the difference is hardly detectable, so this option is of interest only where speed is of the utmost importance.

Remarks and examples


Remarks are presented under the following headings:

    Overview
    Basic description
    1:1 merges
    m:1 merges
    1:m merges
    m:m merges
    Sequential merges
    Treatment of overlapping variables
    Sort order
    Troubleshooting m:m merges
    Examples

Overview

merge 1:1 varlist ... specifies a one-to-one match merge. varlist specifies variables common to both datasets that together uniquely identify single observations in both datasets. For instance, suppose you have a dataset of customer information, called customer.dta, and have a second dataset of other information about roughly the same customers, called other.dta. Suppose further that both datasets identify individuals by using the pid variable, and there is only one observation per individual in each dataset. You would merge the two datasets by typing

    . use customer
    . merge 1:1 pid using other

Reversing the roles of the two files would be fine. Choosing which dataset is the master and which is the using matters only if there are overlapping variable names. 1:1 merges are less common than 1:m and m:1 merges.

merge 1:m and merge m:1 specify one-to-many and many-to-one match merges, respectively. To illustrate the two choices, suppose you have a dataset containing information about individual hospitals, called hospitals.dta. In this dataset, each observation contains information about one hospital, which is uniquely identified by the hospitalid variable. You have a second dataset called discharges.dta, which contains information on individual hospital stays by many different patients. discharges.dta also identifies hospitals by using the hospitalid variable. You would like to join all the information in both datasets. There are two ways you could do this.


merge 1:m varlist ... specifies a one-to-many match merge.

    . use hospitals
    . merge 1:m hospitalid using discharges

would join the discharge data to the hospital data. This is a 1:m merge because hospitalid uniquely identifies individual observations in the dataset in memory (hospitals), but could correspond to many observations in the using dataset.

merge m:1 varlist ... specifies a many-to-one match merge.

    . use discharges
    . merge m:1 hospitalid using hospitals

would join the hospital data to the discharge data. This is an m:1 merge because hospitalid can correspond to many observations in the master dataset, but uniquely identifies individual observations in the using dataset.

merge m:m varlist ... specifies a many-to-many match merge. This is allowed for completeness, but it is difficult to imagine an example of when it would be useful. For an m:m merge, varlist does not uniquely identify the observations in either dataset. Matching is performed by combining observations with equal values of varlist; within matching values, the first observation in the master dataset is matched with the first matching observation in the using dataset; the second, with the second; and so on. If there is an unequal number of observations within a group, then the last observation of the shorter group is used repeatedly to match with subsequent observations of the longer group. Use of merge m:m is not encouraged.

merge 1:1 _n performs a sequential merge. _n is not a variable name; it is Stata syntax for observation number. A sequential merge performs a one-to-one merge on observation number. The first observation of the master dataset is matched with the first observation of the using dataset; the second, with the second; and so on. If there is an unequal number of observations, the remaining observations are unmatched. Sequential merges are dangerous, because they require you to rely on sort order to know that observations belong together. Use this merge at your own risk.

Basic description

Think of merge as being master + using = merged result.

Call the dataset in memory the master dataset, and the dataset on disk the using dataset. This way we have general names that are not dependent on individual datasets.

Suppose we have two datasets,

    master in memory        on disk in file filename
    id   age                id   wgt
    1    22                 1    130
    2    56                 2    180
    5    17                 4    110

We would like to join together the age and weight information. We notice that the id variable identifies unique observations in both datasets: if you tell me the id number, then I can tell you the one observation that contains information about that id. This is true for both the master and the using datasets. Because id uniquely identifies observations in both datasets, this is a 1:1 merge. We can bring in the dataset from disk by typing

    . merge 1:1 id using filename

    in memory              in filename.dta
    master            +    using             =    merged result
    id   age               id   wgt               id   age   wgt
    1    22                1    130               1    22    130   (matched)
    2    56                2    180               2    56    180   (matched)
    5    17                4    110               5    17    .     (master only)
                                                  4    .     110   (using only)
The original data in memory are called the master data. The data in filename.dta are called the using data. After merge, the merged result is left in memory. The id variable is called the key variable. Stata jargon is that the datasets were merged on id.
Observations for id==1 existed in both the master and using datasets and so were combined in the merged result. The same occurred for id==2. For id==5 and id==4, however, no matches were found and thus each became a separate observation in the merged result. Thus each observation in the merged result came from one of three possible sources:
    Numeric
    code     Equivalent word   Description

    1        master            originally appeared in master only
    2        using             originally appeared in using only
    3        match             originally appeared in both

merge encodes this information into new variable _merge, which merge adds to the merged result:

    in memory              in filename.dta
    master            +    using             =    merged result
    id   age               id   wgt               id   age   wgt   _merge
    1    22                1    130               1    22    130   3
    2    56                2    180               2    56    180   3
    5    17                4    110               5    17    .     1
                                                  4    .     110   2

Note: Above we show the master and using data sorted by id before merging; this was for illustrative purposes. The dataset resulting from a 1:1 merge will have the same data, regardless of the sort order of the master and using datasets.
The formal definition for merge behavior is the following: Start with the first observation of the master. Find the corresponding observation in the using data, if there is one. Record the matched or unmatched result. Proceed to the next observation in the master dataset. When you finish working through the master dataset, work through unused observations from the using data. By default, unmatched observations are kept in the merged data, whether they come from the master dataset or the using dataset.
Remember this formal definition. It will serve you well.
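The formal definition above can be emulated in a few lines of Python (illustrative only, not Stata; None stands in for Stata's missing value):

```python
# Illustrative emulation (not Stata) of the formal 1:1 rule: walk the
# master, look up each key in the using data, then append unused using
# observations. None stands in for Stata's missing value.
def merge_1_1(master, using):
    """master, using: dicts mapping key -> value. Returns a list of
    (key, master_val, using_val, merge_code) with codes
    1 = master only, 2 = using only, 3 = matched."""
    result, used = [], set()
    for key, mval in master.items():
        if key in using:
            result.append((key, mval, using[key], 3))
            used.add(key)
        else:
            result.append((key, mval, None, 1))
    for key, uval in using.items():          # unused using observations
        if key not in used:
            result.append((key, None, uval, 2))
    return result

ages = {1: 22, 2: 56, 5: 17}                 # master: id -> age
wgts = {1: 130, 2: 180, 4: 110}              # using:  id -> wgt
merge_1_1(ages, wgts)
# -> [(1, 22, 130, 3), (2, 56, 180, 3), (5, 17, None, 1), (4, None, 110, 2)]
```

This reproduces the id/age/wgt example above, including the _merge codes in the last column.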


1:1 merges
The example shown above is called a 1:1 merge, because the key variable uniquely identified each observation in each of the datasets.
A variable or variable list uniquely identifies the observations if each distinct value of the variable(s) corresponds to one observation in the dataset.
In some datasets, multiple variables are required to identify the observations. Imagine data obtained by observing patients at specific points in time so that variables pid and time, taken together, identify the observations. Below we have two such datasets and run a 1:1 merge on pid and time,
    . merge 1:1 pid time using filename

    master             +    using              =    merged result
    pid  time  x1           pid  time  x2           pid  time  x1   x2   _merge
    14   1     0            14   1     7            14   1     0    7    3
    14   2     0            14   2     9            14   2     0    9    3
    14   4     0            16   1     2            14   4     0    .    1
    16   1     1            16   2     3            16   1     1    2    3
    16   2     1            17   1     5            16   2     1    3    3
    17   1     0            17   2     2            17   1     0    5    3
                                                    17   2     .    2    2

This is a 1:1 merge because the combination of the values of pid and time uniquely identifies observations in both datasets.
There is nothing about a 1:1 merge that implies that all, or even any of, the observations match. Above, five observations matched, one observation was only in the master (subject 14 at time 4), and another was only in the using (subject 17 at time 2).
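With multiple key variables, the key is simply the combination (tuple) of their values; otherwise the 1:1 logic is unchanged. A sketch in Python (illustrative only, not Stata; None stands in for missing):

```python
# Illustrative sketch (not Stata): a 1:1 merge on two key variables,
# using (pid, time) tuples as keys. Data from the example above.
x1 = {(14, 1): 0, (14, 2): 0, (14, 4): 0, (16, 1): 1, (16, 2): 1, (17, 1): 0}
x2 = {(14, 1): 7, (14, 2): 9, (16, 1): 2, (16, 2): 3, (17, 1): 5, (17, 2): 2}

merged = {}
for key in x1.keys() | x2.keys():
    code = 3 if key in x1 and key in x2 else (1 if key in x1 else 2)
    merged[key] = (x1.get(key), x2.get(key), code)

merged[(14, 4)]   # -> (0, None, 1): master only, like subject 14 at time 4
merged[(17, 2)]   # -> (None, 2, 2): using only, like subject 17 at time 2
```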

m:1 merges
In an m:1 merge, the key variable or variables uniquely identify the observations in the using data, but not necessarily in the master data. Suppose you had person-level data within regions and you wished to bring in regional data. Here is an example:
    . merge m:1 region using filename

    master             +    using         =    merged result
    id  region  a           region  x         id  region  a    x    _merge
    1   2       26          1       15        1   2       26   13   3
    2   1       29          2       13        2   1       29   15   3
    3   2       22          3       12        3   2       22   13   3
    4   3       21          4       11        4   3       21   12   3
    5   1       24                            5   1       24   15   3
    6   5       20                            6   5       20   .    1
                                              .   4       .    11   2
To bring in the regional information, we need to merge on region. The values of region identify individual observations in the using data, but region is not an identifier in the master data.

We show the merged dataset sorted by id because this makes it easier to see how the merged dataset was constructed. For each observation in the master data, merge finds the corresponding observation in the using data and adds the values of the using dataset's variables to that observation in the master dataset.
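An m:1 merge behaves like a lookup: the using data is a table keyed by region, and every master observation fetches its region's values. A sketch in Python (illustrative only, not Stata; None stands in for missing), using the data above:

```python
# Illustrative sketch (not Stata) of an m:1 merge: the using data acts
# as a lookup table keyed by region. None stands in for missing.
people  = [(1, 2), (2, 1), (3, 2), (4, 3), (5, 1), (6, 5)]   # master: (id, region)
regions = {1: 15, 2: 13, 3: 12, 4: 11}                       # using:  region -> x

merged = [(pid, reg, regions.get(reg)) for pid, reg in people]
unused = [r for r in regions if r not in {reg for _, reg in people}]

merged[0]   # -> (1, 2, 13): person 1 picked up region 2's x value
merged[5]   # -> (6, 5, None): region 5 has no using observation (_merge==1)
unused      # -> [4]: region 4 enters the merged result as using only (_merge==2)
```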


1:m merges

1:m merges are similar to m:1, except that now the key variables identify unique observations in the master dataset. Any datasets that can be merged using an m:1 merge may be merged using a 1:m merge by reversing the roles of the master and using datasets. Here is the same example as used previously, with the master and using datasets reversed:
    . merge 1:m region using filename

    master         +    using             =    merged result
    region  x           id  region  a         region  x    id   a    _merge
    1       15          1   2       26        1       15   2    29   3
    2       13          2   1       29        1       15   5    24   3
    3       12          3   2       22        2       13   1    26   3
    4       11          4   3       21        2       13   3    22   3
                        5   1       24        3       12   4    21   3
                        6   5       20        4       11   .    .    1
                                              5       .    6    20   2

This merged result is identical to the merged result in the previous section, except for the sort order and the contents of _merge. This time, we show the merged result sorted by region rather than id. Reversing the roles of the files causes a reversal in the 1s and 2s for _merge: where _merge was previously 1, it is now 2, and vice versa. These exchanged _merge values reflect the reversed roles of the master and using data.
For each observation in the master data, merge found the corresponding observation(s) in the using data and then wrote down the matched or unmatched result. Once the master observations were exhausted, merge wrote down any observations from the using data that were never used.

m:m merges

m:m specifies a many-to-many merge and is a bad idea. In an m:m merge, observations are matched within equal values of the key variable(s), with the first observation being matched to the first; the second, to the second; and so on. If the master and using have an unequal number of observations within the group, then the last observation of the shorter group is used repeatedly to match with subsequent observations of the longer group. Thus m:m merges depend on the current sort order, something on which your results should never depend.

Because m:m merges are such a bad idea, we are not going to show you an example. If you think that you need an m:m merge, then you probably need to work with your data so that you can use a 1:m or m:1 merge. Tips for this are given in Troubleshooting m:m merges below.
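To see why the pairing rule is fragile, here is the within-group positional rule sketched in Python (illustrative only, not Stata):

```python
# Illustrative sketch (not Stata) of the m:m within-group pairing rule:
# pair observations positionally, reusing the last observation of the
# shorter group once it runs out.
def pair_m_m(master_group, using_group):
    n = max(len(master_group), len(using_group))
    return [(master_group[min(i, len(master_group) - 1)],
             using_group[min(i, len(using_group) - 1)]) for i in range(n)]

pair_m_m(["a1", "a2", "a3"], ["b1", "b2"])
# -> [('a1', 'b1'), ('a2', 'b2'), ('a3', 'b2')]
```

Note that b2 is paired twice, and which observations land in each slot depends entirely on the current sort order; this is why merge m:m is discouraged and joinby (which forms all pairwise combinations) is usually what is wanted.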

Sequential merges
In a sequential merge, there are no key variables. Observations are matched solely on their observation number:


    . merge 1:1 _n using filename

    master  +    using   =    merged result
    x1           x2           x1   x2   _merge
    10           7            10   7    3
    30           2            30   2    3
    20           1            20   1    3
    5            9            5    9    3
                 3            .    3    2
In the example above, the using data are longer than the master, but that could be reversed. In most cases where sequential merges are appropriate, the datasets are expected to be of equal length, and you should type

    . merge 1:1 _n using filename, assert(match) nogenerate

Sequential merges, like m:m merges, are dangerous. Both depend on the current sort order of the data.

Treatment of overlapping variables

When performing merges of any type, the master and using datasets may have variables in common other than the key variables. We will call such variables overlapping variables. For instance, if the variables in the master and using datasets are

    master:  id, region, sex, age, race
    using:   id, sex, bp, race

and id is the key variable, then the overlapping variables are sex and race.
By default, merge treats values from the master as inviolable. When observations match, it is the master's values of the overlapping variables that are recorded in the merged result.

If you specify the update option, however, then all missing values of overlapping variables in matched observations are replaced with values from the using data. Because of this new behavior, the _merge codes change somewhat. Codes 1 and 2 keep their old meaning. Code 3 splits into codes 3, 4, and 5. Codes 3, 4, and 5 are filtered according to the following rules; the first applicable rule is used.

    5 corresponds to matched observations where at least one overlapping variable had conflicting nonmissing values.

    4 corresponds to matched observations where at least one missing value was updated, but there were no conflicting nonmissing values.

    3 means observations matched, and there were neither updated missing values nor conflicting nonmissing values.

If you specify both the update and replace options, then the _merge==5 cases are updated with values from the using data.
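The first-applicable-rule classification can be sketched in Python (illustrative only, not Stata; None stands in for missing, collapsing the distinction between . and .a):

```python
# Illustrative sketch (not Stata): choosing a matched observation's
# _merge code (3, 4, or 5) under update, checking the overlapping
# variables pairwise. None stands in for missing. Rule 5 is checked
# first, so an observation with both an update and a conflict gets 5.
def match_code(master_vals, using_vals):
    pairs = list(zip(master_vals, using_vals))
    if any(m is not None and u is not None and m != u for m, u in pairs):
        return 5                        # conflicting nonmissing values
    if any(m is None and u is not None for m, u in pairs):
        return 4                        # a missing value was updated
    return 3                            # plain match

match_code([1, 2], [1, 2])       # -> 3
match_code([None, 2], [1, 2])    # -> 4
match_code([1, 2], [9, 2])       # -> 5
match_code([None, 2], [1, 9])    # -> 5 (conflict outranks update)
```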


Sort order
As we have mentioned, in the 1:1, 1:m, and m:1 match merges, the sort orders of the master and using datasets do not affect the data in the merged dataset. This is not the case for m:m merges, which we recommend you never use.
Sorting is used by merge internally for efficiency, so the merged result can be produced most quickly when the master and using datasets are already sorted by the key variable(s) before merging.
You are not required to have the dataset sorted before using merge, however, because merge will sort behind the scenes, if necessary. If the using dataset is not sorted, then a temporary copy is made and sorted to ensure that the current sort order on disk is not affected.
All this is to reassure you that 1) your datasets on disk will not be modified by merge and 2) despite the fact that our discussion has ignored sort issues, merge is, in fact, efficient behind the scenes. It hardly makes any difference in run times, but if you know that the master and using data are already sorted by the key variable(s), then you can specify the sorted option. All that will be saved is the time merge would spend discovering that fact for itself.
The merged result produced by merge orders the variables and observations in a special and sometimes useful way. If you think of datasets as tables, then the columns for the new variables appear to the right of what was the master. If the master data originally had k variables, then the new variables will be the (k + 1)st, (k + 2)nd, and so on. The new observations are similarly ordered so that they all appear at the end of what was the master. If the master originally had N observations, then the new observations, if any, are the (N + 1)st, (N + 2)nd, and so on. Thus the original master data can be found from the merged result by extracting the first k variables and first N observations.
If merge with the update option was specified, however, then be aware that the extracted master may have some updated values.
The merged result is unsorted except for a 1:1 merge where there are only matched observations; in that case, the dataset is sorted by the key variables.

Troubleshooting m:m merges
First, if you think you need to perform an m:m merge, then we suspect you are wrong. If you would like to match every observation in the master to every observation in the using with the same values of the key variable(s), then you should be using joinby; see [D] joinby.
If you still want to use merge, then it is likely that you have forgotten one or more key variables that could be used to identify observations within groups. Perhaps you have panel data with 4 observations on each subject, and you are thinking that what you need to do is
. merge m:m subjectid using filename

Ask yourself if you have a variable that identifies observation within panel, such as a sequence number or a time. If you have, say, a time variable, then you probably should try something like
. merge 1:m subjectid time using filename

(You might need a 1:1 or m:1 merge; 1:m was arbitrarily chosen for the example.)
If you do not have a time or time-like variable, then ask yourself if there is a meaning to matching the first observations within subject, the second observations within subject, and so on. If so, then there is a concept of sequence within subject.


Suppose you do indeed have a sequence concept, but in your dataset it is recorded via the ordering of the observations. Here you are in a dangerous situation because any kind of sorting would lose the identity of the first, second, and nth observation within subject. Your first goal should be to fix this problem by creating an explicit sequence variable from the current ordering—your merge can come later.
Start with your master data. Type
. sort subjectid, stable
. by subjectid: gen seqnum = _n

Do not omit sort’s stable option. That is what will keep the observations in the same order within subject. Save the data. Perform these same three steps on your using data.
After fixing the datasets, you can now type
. merge 1:m subjectid seqnum using filename

If you do not think there is a meaning to being the first, second, and nth observation within subject, then you need to ask yourself what it means to match the first observations within subjectid, the second observations within subjectid, and so on. Would it make equal sense to match the first with the third, the second with the fourth, or any other haphazard matching? If so, then there is no real ordering, so there is no real meaning to merging. You are about to obtain a haphazard result; you need to rethink your merge.
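The sort/by seqnum recipe above can be sketched in Python (illustrative only, not Stata): a stable sort by subject followed by a running count within subject.

```python
# Illustrative sketch (not Stata) of building an explicit sequence
# variable from the current ordering: a stable sort by subject, then a
# running count within subject, like "sort subjectid, stable" followed
# by "by subjectid: gen seqnum = _n".
from itertools import count
from collections import defaultdict

rows = [("b", "y1"), ("a", "x1"), ("b", "y2"), ("a", "x2")]   # (subjectid, data)
rows.sort(key=lambda r: r[0])    # Python's sort is stable, like sort ..., stable

counters = defaultdict(count)    # one running counter per subject
with_seq = [(sid, data, next(counters[sid]) + 1) for sid, data in rows]
# -> [('a', 'x1', 1), ('a', 'x2', 2), ('b', 'y1', 1), ('b', 'y2', 2)]
```

The stable sort is what preserves the within-subject order that the sequence numbers are meant to capture; an unstable sort could scramble it, which is exactly the danger described above.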

Examples
Example 1: A 1:1 merge
We have two datasets, one of which has information about the size of old automobiles, and the other of which has information about their expense:
. use http://www.stata-press.com/data/r13/autosize
(1978 Automobile Data)
. list

         make               weight   length

    1.   Toyota Celica       2,410      174
    2.   BMW 320i            2,650      177
    3.   Cad. Seville        4,290      204
    4.   Pont. Grand Prix    3,210      201
    5.   Datsun 210          2,020      165
    6.   Plym. Arrow         3,260      170

. use http://www.stata-press.com/data/r13/autoexpense
(1978 Automobile Data)
. list

         make                price    mpg

    1.   Toyota Celica       5,899     18
    2.   BMW 320i            9,735     25
    3.   Cad. Seville       15,906     21
    4.   Pont. Grand Prix    5,222     19
    5.   Datsun 210          4,589     35


We can see that these datasets contain different information about nearly the same cars—the autosize file has one more car. We would like to get all the information about all the cars into one dataset.
Because we are adding new variables to old variables, this is a job for the merge command. We need only to decide what type of match merge we need.
Looking carefully at the datasets, we see that the make variable, which identifies the cars in each of the two datasets, also identifies individual observations within the datasets. What this means is that if you tell me the make of car, I can tell you the one observation that corresponds to that car.
Because this is true for both datasets, we should use a 1:1 merge.
We will start with a clean slate to show the full process:

. use http://www.stata-press.com/data/r13/autosize
(1978 Automobile Data)
. merge 1:1 make using http://www.stata-press.com/data/r13/autoexpense

    Result                      # of obs.
    not matched                         1
        from master                     1   (_merge==1)
        from using                      0   (_merge==2)

    matched                             5   (_merge==3)

. list

         make               weight   length    price   mpg            _merge

    1.   BMW 320i            2,650      177    9,735    25       matched (3)
    2.   Cad. Seville        4,290      204   15,906    21       matched (3)
    3.   Datsun 210          2,020      165    4,589    35       matched (3)
    4.   Plym. Arrow         3,260      170        .     .   master only (1)
    5.   Pont. Grand Prix    3,210      201    5,222    19       matched (3)
    6.   Toyota Celica       2,410      174    5,899    18       matched (3)
The merge is successful: all the data are present in the combined dataset, even the data from the one car that has only size information. If we wanted only those makes for which all information is present, it would be up to us to drop the observations for which _merge < 3.

Example 2: Requiring matches

Suppose we had the same setup as in the previous example, but we erroneously think that we have all the information on all the cars. We could tell merge that we expect only matches by using the assert() option.

. use http://www.stata-press.com/data/r13/autosize, clear
(1978 Automobile Data)
. merge 1:1 make using http://www.stata-press.com/data/r13/autoexpense,
> assert(match)
merge:  after merge, not all observations matched
        (merged result left in memory)
r(9);

merge tells us that there is a problem with our assumption. To see how many mismatches there were, we can tabulate _merge:

. tabulate _merge

             _merge    Freq.     Percent       Cum.

    master only (1)        1       16.67      16.67
        matched (3)        5       83.33     100.00

              Total        6      100.00
If we would like to list the problem observation, we can type

. list if _merge < 3

         make          weight   length   price   mpg            _merge

    4.   Plym. Arrow    3,260      170       .     .   master only (1)

If we were convinced that all data should be complete in the two datasets, we would have to rectify the mismatch in the original datasets.

Example 3: Keeping just the matches
Once again, suppose that we had the same datasets as before, but this time we want the final dataset to have only those observations for which there is a match. We do not care if there are mismatches; all that is important are the complete observations. By using the keep(match) option, we will guarantee that this happens. Because we are keeping only those observations for which the key variable matches, there is no need to generate the _merge variable. We could do the following:
. use http://www.stata-press.com/data/r13/autosize, clear
(1978 Automobile Data)
. merge 1:1 make using http://www.stata-press.com/data/r13/autoexpense,
> keep(match) nogenerate

    Result                      # of obs.
    not matched                         0
    matched                             5

. list

         make               weight   length    price   mpg

    1.   BMW 320i            2,650      177    9,735    25
    2.   Cad. Seville        4,290      204   15,906    21
    3.   Datsun 210          2,020      165    4,589    35
    4.   Pont. Grand Prix    3,210      201    5,222    19
    5.   Toyota Celica       2,410      174    5,899    18

Example 4: Many-to-one matches
We have two datasets: one has salespeople in regions and the other has regional data about sales.
We would like to put all the information into one dataset. Here are the datasets:

. use http://www.stata-press.com/data/r13/sforce, clear
(Sales Force)
. list

          region   name

     1.  N Cntrl   Krantz
     2.  N Cntrl   Phipps
     3.  N Cntrl   Willis
     4.       NE   Ecklund
     5.       NE   Franks
     6.    South   Anderson
     7.    South   Dubnoff
     8.    South   Lee
     9.    South   McNeil
    10.     West   Charles
    11.     West   Cobb
    12.     West   Grant

. use http://www.stata-press.com/data/r13/dollars
(Regional Sales & Costs)
. list

         region     sales      cost

    1.  N Cntrl   419,472   227,677
    2.       NE   360,523   138,097
    3.    South   532,399   330,499
    4.     West   310,565   165,348
We can see that the region would be used to match observations in the two datasets, and this time we see that region identifies individual observations in the dollars dataset but not in the sforce dataset. This means we will have to use either an m:1 or a 1:m merge. Here we will open the sforce dataset and then merge the dollars dataset. This will be an m:1 merge, because region does not identify individual observations in the dataset in memory but does identify them in the using dataset.
Here is the command and its result:
. use http://www.stata-press.com/data/r13/sforce
(Sales Force)
. merge m:1 region using http://www.stata-press.com/data/r13/dollars
(label region already defined)

    Result                      # of obs.
    not matched                         0
    matched                            12   (_merge==3)


. list

          region   name         sales      cost        _merge

     1.  N Cntrl   Krantz     419,472   227,677   matched (3)
     2.  N Cntrl   Phipps     419,472   227,677   matched (3)
     3.  N Cntrl   Willis     419,472   227,677   matched (3)
     4.       NE   Ecklund    360,523   138,097   matched (3)
     5.       NE   Franks     360,523   138,097   matched (3)
     6.    South   Anderson   532,399   330,499   matched (3)
     7.    South   Dubnoff    532,399   330,499   matched (3)
     8.    South   Lee        532,399   330,499   matched (3)
     9.    South   McNeil     532,399   330,499   matched (3)
    10.     West   Charles    310,565   165,348   matched (3)
    11.     West   Cobb       310,565   165,348   matched (3)
    12.     West   Grant      310,565   165,348   matched (3)

We can see from the result that all the values of region were matched in both datasets. This is a rare occurrence in practice!
Had we had the dollars dataset in memory and merged in the sforce dataset, we would have done a 1:m merge.
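For readers who think in general-purpose code, what an m:1 match does can be sketched in plain Python (an illustrative analogy, not how Stata implements merge; the dictionaries below are abbreviated, hypothetical stand-ins for the sforce and dollars data):

```python
# Sketch of an m:1 merge: the using data must be unique on the key, so it
# can serve as a lookup table; each master observation then pulls in its match.
master = [
    {"region": "N Cntrl", "name": "Krantz"},
    {"region": "NE",      "name": "Ecklund"},
    {"region": "South",   "name": "Anderson"},
]
using = [
    {"region": "N Cntrl", "sales": 419472, "cost": 227677},
    {"region": "NE",      "sales": 360523, "cost": 138097},
    {"region": "South",   "sales": 532399, "cost": 330499},
]

lookup = {row["region"]: row for row in using}
assert len(lookup) == len(using), "m:1 requires the key to be unique in using"

merged = []
for row in master:
    match = lookup.get(row["region"])
    if match is None:
        merged.append({**row, "_merge": 1})            # master only
    else:
        merged.append({**row, **match, "_merge": 3})   # matched
# (a full implementation would also append using-only rows with _merge == 2)

print(merged[0])
```

The uniqueness assertion plays the role of the check that merge m:1 performs on the key variable in the using dataset.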

We would now like to present a series of examples showing how merge treats nonkey variables that have the same names in the two datasets. We will call these “overlapping” variables.

Example 5: Overlapping variables
Here are two datasets whose only purpose is for this illustration:
. use http://www.stata-press.com/data/r13/overlap1, clear
. list, sepby(id)

       id   seq   x1   x2

  1.    1     1    1    1
  2.    1     2    1    .
  3.    1     3    1    2
  4.    1     4    .    2

  5.    2     1    .    1
  6.    2     2    .    2
  7.    2     3    1    1
  8.    2     4    1    2
  9.    2     5   .a    1
 10.    2     6   .a    2

 11.    3     1    .   .a
 12.    3     2    .    1
 13.    3     3    .    .
 14.    3     4   .a   .a

 15.   10     1    5    8
. use http://www.stata-press.com/data/r13/overlap2
. list

       id   bar   x1   x2

  1.    1    11    1    1
  2.    2    12    .    1
  3.    3    14    .   .a
  4.   20    18    1    1
We can see that id can be used as the key variable for putting the two datasets together. We can also see that there are two overlapping variables: x1 and x2.
We will start with a simple m:1 merge:
. use http://www.stata-press.com/data/r13/overlap1
. merge m:1 id using http://www.stata-press.com/data/r13/overlap2

    Result                           # of obs.
    -----------------------------------------
    not matched                             2
        from master                         1  (_merge==1)
        from using                          1  (_merge==2)

    matched                                14  (_merge==3)
    -----------------------------------------
. list, sepby(id)

       id   seq   x1   x2   bar            _merge

  1.    1     1    1    1    11       matched (3)
  2.    1     2    1    .    11       matched (3)
  3.    1     3    1    2    11       matched (3)
  4.    1     4    .    2    11       matched (3)

  5.    2     1    .    1    12       matched (3)
  6.    2     2    .    2    12       matched (3)
  7.    2     3    1    1    12       matched (3)
  8.    2     4    1    2    12       matched (3)
  9.    2     5   .a    1    12       matched (3)
 10.    2     6   .a    2    12       matched (3)

 11.    3     1    .   .a    14       matched (3)
 12.    3     2    .    1    14       matched (3)
 13.    3     3    .    .    14       matched (3)
 14.    3     4   .a   .a    14       matched (3)

 15.   10     1    5    8     .   master only (1)

 16.   20     .    1    1    18    using only (2)

Careful inspection shows that for the matched id values, the values of x1 and x2 are still those that were originally in the overlap1 dataset. This is the default behavior of merge: the master dataset is the authority, and its data are kept intact.
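The master-wins rule for overlapping variables can be sketched in one line of plain Python (an analogy only; the values are hypothetical, not Stata internals):

```python
# Sketch of merge's default treatment of overlapping (same-named, nonkey)
# variables: the master's values are kept intact, and the using values are
# not consulted at all.
master_row = {"id": 1, "x1": 1, "x2": 2}
using_row  = {"id": 1, "x1": 9, "x2": 9, "bar": 11}

# In {**using_row, **master_row}, later keys win, so every variable present
# in the master keeps its master value; only genuinely new variables
# (here, bar) are taken from the using data.
merged = {**using_row, **master_row}

print(merged)   # x1 and x2 are still the master's values
```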


Example 6: Updating missing data
Now we would like to investigate the update option. Used by itself, it will replace missing values in the master dataset with values from the using dataset:
. use http://www.stata-press.com/data/r13/overlap1, clear
. merge m:1 id using http://www.stata-press.com/data/r13/overlap2, update

    Result                           # of obs.
    -----------------------------------------
    not matched                             2
        from master                         1  (_merge==1)
        from using                          1  (_merge==2)

    matched                                14
        not updated                         5  (_merge==3)
        missing updated                     4  (_merge==4)
        nonmissing conflict                 5  (_merge==5)
    -----------------------------------------

. list, sepby(id)

       id   seq   x1   x2   bar                    _merge

  1.    1     1    1    1    11              matched (3)
  2.    1     2    1    1    11      missing updated (4)
  3.    1     3    1    2    11  nonmissing conflict (5)
  4.    1     4    1    2    11  nonmissing conflict (5)

  5.    2     1    .    1    12              matched (3)
  6.    2     2    .    2    12  nonmissing conflict (5)
  7.    2     3    1    1    12              matched (3)
  8.    2     4    1    2    12  nonmissing conflict (5)
  9.    2     5    .    1    12      missing updated (4)
 10.    2     6    .    2    12  nonmissing conflict (5)

 11.    3     1    .   .a    14              matched (3)
 12.    3     2    .    1    14              matched (3)
 13.    3     3    .   .a    14      missing updated (4)
 14.    3     4    .   .a    14      missing updated (4)

 15.   10     1    5    8     .          master only (1)

 16.   20     .    1    1    18           using only (2)

Looking through the resulting dataset observation by observation, we can see both what the update option updated and how the _merge variable gets its values.
The following listing shows what is happening, where x1_m and x2_m come from the master dataset (overlap1), x1_u and x2_u come from the using dataset (overlap2), and x1 and x2 are the values that appear when using merge with the update option.


       id   x1_m   x1_u   x1   x2_m   x2_u   x2                    _merge

  1.    1      1      1    1      1      1    1              matched (3)
  2.    1      1      1    1      .      1    1      missing updated (4)
  3.    1      1      1    1      2      1    2  nonmissing conflict (5)
  4.    1      .      1    1      2      1    2  nonmissing conflict (5)

  5.    2      .      .    .      1      1    1              matched (3)
  6.    2      .      .    .      2      1    2  nonmissing conflict (5)
  7.    2      1      .    1      1      1    1              matched (3)
  8.    2      1      .    1      2      1    2  nonmissing conflict (5)
  9.    2     .a      .    .      1      1    1      missing updated (4)
 10.    2     .a      .    .      2      1    2  nonmissing conflict (5)

 11.    3      .      .    .     .a     .a   .a              matched (3)
 12.    3      .      .    .      1     .a    1              matched (3)
 13.    3      .      .    .      .     .a   .a      missing updated (4)
 14.    3     .a      .    .     .a     .a   .a      missing updated (4)

 15.   10      5      .    5      8      .    8          master only (1)

 16.   20      .      1    1      .      1    1           using only (2)

From this, we can see two important facts: if an observation has both a conflict and an updated value, _merge reflects the conflict; and a missing value in the master dataset can be updated by a different missing value from the using dataset.
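The per-value update rule, and the way per-observation _merge codes combine, can be sketched in plain Python (an analogy in which None stands in for missing; note that Stata actually has several missing values, ., .a, ..., .z, so one missing value can be updated to another, which a single None cannot model):

```python
# Sketch of merge's update rule for one overlapping value. The returned
# code mirrors _merge: 3 = not updated, 4 = missing updated,
# 5 = nonmissing conflict (the master's value is kept).
def update(master, using):
    if master is None and using is not None:
        return using, 4                     # missing updated
    if None not in (master, using) and master != using:
        return master, 5                    # nonmissing conflict
    return master, 3                        # not updated

assert update(None, 1) == (1, 4)   # master missing -> filled from using
assert update(2, 1) == (2, 5)      # conflict -> master kept, but flagged
assert update(1, 1) == (1, 3)      # agreement -> nothing to do

# Per observation, _merge records the most severe outcome across all
# overlapping variables: a conflict (5) outranks an update (4).
codes = [update(m, u)[1] for m, u in [(None, 1), (2, 1)]]
print(max(codes))                  # this observation is flagged 5
```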

Example 7: Updating all common observations
We would like to see what happens if the update and replace options are specified together. The replace option extends the action of update by using nonmissing values from the using dataset to replace values in the master dataset. The values of _merge are unaffected by combining update and replace.
. use http://www.stata-press.com/data/r13/overlap1, clear
. merge m:1 id using http://www.stata-press.com/data/r13/overlap2, update replace

    Result                           # of obs.
    -----------------------------------------
    not matched                             2
        from master                         1  (_merge==1)
        from using                          1  (_merge==2)

    matched                                14
        not updated                         5  (_merge==3)
        missing updated                     4  (_merge==4)
        nonmissing conflict                 5  (_merge==5)
    -----------------------------------------


. list, sepby(id)

       id   seq   x1   x2   bar                    _merge

  1.    1     1    1    1    11              matched (3)
  2.    1     2    1    1    11      missing updated (4)
  3.    1     3    1    1    11  nonmissing conflict (5)
  4.    1     4    1    1    11  nonmissing conflict (5)

  5.    2     1    .    1    12              matched (3)
  6.    2     2    .    1    12  nonmissing conflict (5)
  7.    2     3    1    1    12              matched (3)
  8.    2     4    1    1    12  nonmissing conflict (5)
  9.    2     5    .    1    12      missing updated (4)
 10.    2     6    .    1    12  nonmissing conflict (5)

 11.    3     1    .   .a    14              matched (3)
 12.    3     2    .    1    14              matched (3)
 13.    3     3    .   .a    14      missing updated (4)
 14.    3     4    .   .a    14      missing updated (4)

 15.   10     1    5    8     .          master only (1)

 16.   20     .    1    1    18           using only (2)
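The combined behavior of update and replace can be sketched in plain Python (an analogy with None standing in for missing, not Stata internals):

```python
# Sketch of merge with update and replace: a nonmissing using value now
# wins conflicts outright, while the _merge codes are assigned exactly as
# with update alone, which is why the result table is unchanged.
def update_replace(master, using):
    if master is None and using is not None:
        return using, 4                     # missing updated
    if None not in (master, using) and master != using:
        return using, 5                     # conflict: using replaces master
    return master, 3                        # not updated

assert update_replace(2, 1) == (1, 5)       # using wins the conflict
assert update_replace(None, 1) == (1, 4)    # missing still gets filled
assert update_replace(1, None) == (1, 3)    # missing using never overwrites
```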

Example 8: More on the keep() option
Suppose we would like to use the update option, as we did above, but we would like to keep only those observations for which the value of the key variable, id, was found in both datasets. This is more complicated than in our earlier example, because the update option splits the matches into matches, match updates, and match conflicts. We must either use all these code words in the keep() option or use their numerical equivalents 3, 4, and 5. Here the latter is simpler.
. use http://www.stata-press.com/data/r13/overlap1, clear
. merge m:1 id using http://www.stata-press.com/data/r13/overlap2, update
>     keep(3 4 5)

    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                                14
        not updated                         5  (_merge==3)
        missing updated                     4  (_merge==4)
        nonmissing conflict                 5  (_merge==5)
    -----------------------------------------

. list, sepby(id)

       id   seq   x1   x2   bar                    _merge

  1.    1     1    1    1    11              matched (3)
  2.    1     2    1    1    11      missing updated (4)
  3.    1     3    1    2    11  nonmissing conflict (5)
  4.    1     4    1    2    11  nonmissing conflict (5)

  5.    2     1    .    1    12              matched (3)
  6.    2     2    .    2    12  nonmissing conflict (5)
  7.    2     3    1    1    12              matched (3)
  8.    2     4    1    2    12  nonmissing conflict (5)
  9.    2     5    .    1    12      missing updated (4)
 10.    2     6    .    2    12  nonmissing conflict (5)

 11.    3     1    .   .a    14              matched (3)
 12.    3     2    .    1    14              matched (3)
 13.    3     3    .   .a    14      missing updated (4)
 14.    3     4    .   .a    14      missing updated (4)
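What keep(3 4 5) does after the merge can be sketched in plain Python (an analogy; the rows are hypothetical (id, _merge) pairs):

```python
# Sketch of keep(3 4 5): observations whose _merge code is not among the
# requested results are dropped from the merged data.
rows = [(1, 3), (1, 4), (1, 5), (10, 1), (20, 2)]   # (id, _merge) pairs

kept = [(i, m) for i, m in rows if m in {3, 4, 5}]
print(kept)                # the master-only and using-only rows are gone
```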

Example 9: A one-to-many merge
As a final example, we would like to show a 1:m merge. There is nothing conceptually different here; what is interesting is the order of the observations in the final dataset:
. use http://www.stata-press.com/data/r13/overlap2, clear
. merge 1:m id using http://www.stata-press.com/data/r13/overlap1

    Result                           # of obs.
    -----------------------------------------
    not matched                             2
        from master                         1  (_merge==1)
        from using                          1  (_merge==2)

    matched                                14  (_merge==3)
    -----------------------------------------


. list, sepby(id)

       id   bar   x1   x2   seq           _merge

  1.    1    11    1    1     1      matched (3)

  2.    2    12    .    1     1      matched (3)

  3.    3    14    .   .a     1      matched (3)

  4.   20    18    1    1     .  master only (1)

  5.    1    11    1    1     2      matched (3)
  6.    1    11    1    1     3      matched (3)
  7.    1    11    1    1     4      matched (3)

  8.    2    12    .    1     2      matched (3)
  9.    2    12    .    1     3      matched (3)
 10.    2    12    .    1     4      matched (3)
 11.    2    12    .    1     5      matched (3)
 12.    2    12    .    1     6      matched (3)

 13.    3    14    .   .a     2      matched (3)
 14.    3    14    .   .a     3      matched (3)
 15.    3    14    .   .a     4      matched (3)

 16.   10     .    5    8     1   using only (2)

We can see here that the first four observations come from the master dataset, and all additional observations, whether matched or unmatched, come below these observations. This illustrates that the master dataset is always in the upper-left corner of the merged dataset.
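The ordering rule can be sketched in plain Python (an analogy using bare, abbreviated key values; not Stata internals):

```python
# Sketch of the observation order a 1:m merge produces: each master
# observation keeps its position and absorbs the first match from the
# using data; further matches, and any using-only observations, are
# appended below the master block.
master = [1, 2, 3, 20]            # key values, unique in the master
using  = [1, 1, 1, 2, 2, 10]      # key values, repeated in the using data

first_match_used = set()
extra = []
for key in using:
    if key in master and key not in first_match_used:
        first_match_used.add(key)   # merged into the existing master row
    else:
        extra.append(key)           # appended below the master block

result = master + extra
print(result)
```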

References
Golbe, D. L. 2010. Stata tip 83: Merging multilingual datasets. Stata Journal 10: 152–156.
Gould, W. W. 2011a. Merging data, part 1: Merges gone bad. The Stata Blog: Not Elsewhere Classified. http://blog.stata.com/2011/04/18/merging-data-part-1-merges-gone-bad/.
Gould, W. W. 2011b. Merging data, part 2: Multiple-key merges. The Stata Blog: Not Elsewhere Classified. http://blog.stata.com/2011/05/27/merging-data-part-2-multiple-key-merges/.
Nash, J. D. 1994. dm19: Merging raw data and dictionary files. Stata Technical Bulletin 20: 3–5. Reprinted in Stata Technical Bulletin Reprints, vol. 4, pp. 22–25. College Station, TX: Stata Press.
Weesie, J. 2000. dm75: Safe and easy matched merging. Stata Technical Bulletin 53: 6–17. Reprinted in Stata Technical Bulletin Reprints, vol. 9, pp. 62–77. College Station, TX: Stata Press.

Also see
[D] append — Append datasets
[D] cross — Form every pairwise combination of two datasets
[D] joinby — Form all pairwise combinations within groups
[D] save — Save Stata dataset
[D] sort — Sort data
[U] 22 Combining datasets
