Date Sequence Identification Problem
I have a data table which consists of testing results for multiple locations. The relevant columns are: Master ID, Sample Point ID, and Collection Date. There can be multiple Sample Points per Master ID, and multiple dates per Sample Point ID.
I have a filtered list of Master ID/Sample Point ID. Now I need to find Collection Dates for these sites that match the following parameters:
Each Sample Point needs to have collected in each Calendar Quarter, but the samples must be spaced between 2-4 calendar months apart. Days and years are ignorable.
However, each sample point may have more or less than 4 samples.
I need to identify which site have sampled appropriately (by listing the sample dates), and which are missing data. Ideally I would like to know partially filled sample points, if say that 3 samples fit the criteria.
What is the best way to identify an appropriate sequence of dates for each Sample Point?
Examples:
A March sample and an April sample are not compatible (3 and 4 are not an allowable pair, even though they are in separate quarters, they are outside the 2-4 month range).
A January sample and a June sample are not compatible (1 and 6 are not an allowable pair, as they are outside the 2-4 month range).
A January sample excludes a December sample from being accepted, because they are consecutive calendar months.
A March sample excludes an October sample from being accepted (because they are 5 calendar months apart.
My initial approach was to get an array of the unique month numbers for that master ID & sample point ID, then compare to a table of acceptable sequences to find a match. Then identify which sequence matched to search for results containing those month numbers (for that ID/sample point). But there are 35 possible acceptable sequences, and this brute force started feeling like the wrong approach.
Thanks!
EDITS AS REQUESTED:
Version is Excel 365
Example Source data:
| Master ID | Sample Point ID | Collection Date | Sample ID |
|---|---|---|---|
| 100 | E1 | 2/6/2023 | 1 |
| 100 | E1 | 4/6/2023 | 2 |
| 100 | E1 | 7/21/2023 | 3 |
| 100 | E1 | 10/18/2025 | 4 |
| 100 | E2 | 8/9/2021 | 5 |
| 100 | E2 | 10/28/2024 | 6 |
| 101 | E1 | 1/5/2023 | 7 |
| 101 | E1 | 4/16/2024 | 8 |
| 101 | E1 | 6/9/2024 | 9 |
| 200 | E5 | 1/2/2023 | 10 |
| 200 | E5 | 2/2/2023 | 11 |
| 200 | E5 | 4/6/2022 | 12 |
| 200 | E5 | 8/9/2023 | 13 |
| 200 | E5 | 11/7/2022 | 14 |
| 200 | E2 | 1/2/2023 | 15 |
| 200 | E2 | 2/2/2023 | 16 |
| 200 | E2 | 3/3/2023 | 17 |
| 201 | E11 | 3/6/2021 | 18 |
| 201 | E11 | 5/7/2022 | 19 |
| 201 | E11 | 9/4/2023 | 20 |
| 201 | E11 | 11/17/2024 | 21 |
Example output:
| Master ID | Sample Point ID | Q1 Date | Q2 Date | Q3 Date | Q4 Date | Q1 Sample ID | Q2 Sample ID | Q3 Sample ID | Q4 Sample ID |
|---|---|---|---|---|---|---|---|---|---|
| 100 | E1 | 2/6/2023 | 4/6/2023 | 7/21/2023 | 10/18/2025 | Sample ID's can be found easily with xloopup once dates are identified | |||
| 100 | E2 | missing | missing | 8/9/2021 | 10/28/2024 | ||||
| 101 | E1 | 1/5/2023 | 4/16/2024 | missing | missing | ||||
| 200 | E5 | 1/2/2023 | 4/6/2022 | 8/9/2023 | 11/7/2022 | ||||
| 200 | E2 | 2/2/2023 | missing | missing | missing | ||||
| 201 | E11 | 3/6/2021 | 5/7/2022 | 9/4/2023 | 11/17/2024 |
[link] [comments]
Want to read more?
Check out the full article on the original site