Query Ninja: Working with data from multiple tables (using joins). Part One
This article is part of the Query Ninja section of "The MapInfo Pro" journal. There are many articles that cover everything from the basics of querying data right up through advanced topics such as sub-selecting. In this LI360 community, the Topic "Query Ninja" contains a list.
For those who are new to database concepts, here are three important terms. Many of you will know these things but for some newer users of MapInfo Pro, this info might come in handy.
Table - A set of data in MapInfo Pro. A table may also appear as a layer in a map.
Record - A single row of a table (as appears in a Browser window).
Field - A column in the table (as appears in a Browser window or in the Table Structure dialog box).
Using the SQL capabilities it is possible to work with more than one table at a time. Here are some examples of where this might be helpful.
- A table of medical offices can have aggregated information such as customer counts added to it.
- A table of territories or regions, such as the catchment area served by a particular store can have the total number of customers or total sales aggregated into it.
- A table containing some form of demographic areas can have additional data variables from another table added to it.
There are two different ways tables can be joined. In this article we introduce you to relational joins. A separate article covers geographic joins.
Joining tables where there is a matching key field
You may have tables where there is a field in the table that will help you associate each record. For example, a table of customers may have a field indicating what store or service office they visited. This could be used to link each customer with their store or office.
This is called a relational join. This is because you are relating the two tables together by some common value. MapInfo Pro can also join tables together using the map objects (for example, join where the points from one table fall within the regions from a second table). We will cover this next month.
For each example we will show you a snap shot of the tables and the SQL Select dialog box filled in to accomplish the join.
Relational Join Example: Students and Schools
Remember, to get started you use the Query > SQL Select command.
See below two very simple tables. One table contains students and one contains schools. We will use MapInfo Pro to count up how many students are in each school. Note that the student table has the school ID number in it. And of course, the schools table also has a school ID number. We can use the school IDs to relate the data from the two tables together.
Each number is associated with additional comments below.
- Start by selecting the tables you want to join.
- The join condition will sometimes be filled in for you automatically. MapInfo Pro tries to figure out how the tables are related. You should check it as the software can do this incorrectly. If the join is not put in for you, you will need to add it in yourself. You can type the column names or get them from the Columns drop-down list.
- The second column is from the Aggregates drop-down list. This will count up the number of students.
- In Group by Columns box we specify the column by which the students should be counted up. In this case it is by the schools.
- In Sort by Columns we have specified that the result should be sorted by the count of students in descending order. The keyword "desc" has to be typed in. If you want ascending order you do not need to specify anything, it is the default.
- You can give the result a more friendly table name. If you leave the default value Selection in then you will get Query1, Query2, Query3 etc, depending on how many queries you have created during your session.
Here is a part of the resulting data set. The Count column represents the number of students in the table going to the school.
A little additional technical info
When you use an aggregate function in an SQL query the result will not have any map objects. If you do not specify an aggregate function then the first table's map objects will be included in the resulting query.
Joining more than two tables together is possible. The join condition must specify relationships amongst the tables as follows this example.
Table1.columnX = Table2.columnX AND Table2.columnX = Table3.columnX
What if my data doesn't have matching columns?
In this example we have data in the two tables that acts as a lookup key between the datasets. We can relate the students to what school they go to because the student data has the identifier of the school in it.
What if you have a set of data and you do not have any matching columns. An example could be a table of customer data for which you wish to aggregate the customers into their county or some other geographic area for which no column exists in the customer data. It would be common to lack geographic region information for things that are not a part of a standard address.
In cases like this, it is possible to do a Geographic Join. This is the subject of its own article in this community. You can find it in the Query Ninja topic (see the topic list on the main community page).