Power Query: how do I change a row value based on results of a count of all rows?
I am trying to take a table that contains 4 relevant fields (symbol, date, shortfield, and comment) and remove duplicates such that there is one record for each symbol, date, and shortfield.
Where a sym&date&shortfield exists with different non-blank comments, I would like to edit the comment before removing duplicates: if one of those comments is "high", change all comments for that combo to blank. otherwise, change those comments to "low".
Here's my current code:
let Source = Excel.CurrentWorkbook(){[Name="fidstats"]}[Content], #"SelectedTypes" = Table.SelectRows(Source, each ([Type] = "Ind Rank" or [Type] = "Metric" or [Type] = "Ratio Calc")), #"RevisedField" = Table.AddColumn(SelectedTypes, "ShortField", each if [Field] = "Valuation" then "val" else if [Field] = "Quality" then "qual" else if [Field] = "Stability" then "stab" else if [Field] = "Health" then "health" else if Text.Contains([Field],"EPS G") then "eps" else if [Field] = "Proj EPS" then "proj eps" else if [Field] = "Fwd EPS" then "fwd eps" else if Text.Contains([Field],"ROI") then "roi" else if Text.Contains([Field],"Ratio") then "ratio" else if Text.Contains([Field],"D/E") then "d/e" else "", type any), #"RevisedValue" = Table.AddColumn(RevisedField, "RevVal", each if [ShortField]="d/e" and [Type]="Ind Rank" then 1-([Value]/100) else if [Type]="Ind Rank" then [Value]/100 else [Value], type any), #"AddedComment" = Table.AddColumn(RevisedValue, "NewComm", each if [Type]="Ind Rank" then if [RevVal]<.25 then "bad" else if [RevVal]<0.35 then "low" else if [RevVal]>0.66 then "high" else "" else if [Type]="Ratio Calc" then if [RevVal]<0.6 then "bad" else if [RevVal]<1 then "low" else if [RevVal]>1.5 then "high" else "" else if [Type]="Metric" then if [RevVal] is null then "" else if [RevVal]<25 then "bad" else if [RevVal]<40 then "low" else if [RevVal]>70 then "high" else "" else "", type any), #"SelectedComments" = Table.SelectRows(AddedComment, each ([NewComm] <> "")), #"AddedCommRank" = Table.AddColumn(SelectedComments, "CommRank", each if [NewComm]="high" then 1 else if [NewComm]="low" then 2 else if [NewComm]="bad" then 3 else "", type number), #"SortedRank" = Table.Sort(AddedCommRank,{{"CommRank", Order.Ascending}, {"Date", Order.Descending}}), #"RemovedColumns" = Table.SelectColumns(SortedRank,{"Sym", "Date", "NewComm", "ShortField"}), #"RemovedDuplicates" = Table.Distinct(RemovedColumns), #"GroupedFields" = Table.Group(RemovedDuplicates, {"Sym", "Date", "NewComm"}, {{"Fields", each Text.Combine([ShortField],", "), type text}}), #"AddedCommFields" = Table.AddColumn(GroupedFields, "CommFields", each Text.Combine({[NewComm], " ", [Fields]}), type text), #"GroupedMonth" = Table.Group(AddedCommFields, {"Sym", "Date"}, {{"AddedCommFields", each Text.Combine([CommFields],"; "), type text}}), #"CombinedText" = Table.AddColumn(GroupedMonth, "MonthCommFields", each Text.Combine({DateTime.ToText([Date], "MMM"), ": ", [AddedCommFields]}), type text), #"GroupedSym" = Table.Group(CombinedText, {"Sym"}, {{"CombinedText", each Text.Combine([MonthCommFields],"#(lf)"), type text}}) in #"GroupedSym" And here are the current results:
| Sym | CombinedText |
|---|---|
| MSFT | Jun: high val <br> May: high roi, health, stab, qual |
| ETN | May: high roi; low qual, ratio, d/e, eps; bad d/e, stab |
| DTE | May: high ratio, proj eps; low d/e, health, eps, fwd eps; bad stab, d/e, roi, eps, ratio |
| EIX | May: high roi, eps, health, val; low proj eps; bad ratio, d/e, eps, stab |
(note that the <br> in MSFT combinedtext is an actual newline in the cell, which reddit can't display properly)
You can see that ETN and EIX both have the problem I'm trying to solve for: ETN has d/e appear as both low and bad, and EIX has eps return as both high and bad. I would like ETN to have d/e only appear as low, and I would like EIX to not list eps at all.
Before the grouping fields step, I'd like to do something like this pseudocode:
if count (sym,date,shortfield) <> count (sym,date,shortfield,newcomm) then if count(newcomm=high)>0,"","low") then remove blank newcomms and duplicate records
How do I accomplish this in Power Query?
[link] [comments]
Want to read more?
Check out the full article on the original site