R/get_teams_attendance.R
Combines several Microsoft Teams attendance reports (one downloaded CSV file per workshop day) into a single tibble with one row per unique participant and one column per day holding the attendance duration.
get_teams_attendance(
files,
unit = c("hours", "minutes"),
match_by_email = TRUE,
merge_contained_names = TRUE,
export = TRUE,
xlsx_path = NULL
)
files
Character vector of paths to the Teams attendance CSV files (one per day).
unit
Either "hours" (default) or "minutes"; the unit of the per-day duration
cells. Hours are rounded to one decimal place.
match_by_email
Logical; if TRUE (default), authenticated users
are merged across days via their e-mail address (column userName),
so that display-name variants like "Max M." and
"Max Mustermann" collapse into one row. Anonymous guests (no e-mail)
are always matched by display name only.
merge_contained_names
Logical; if TRUE (default), display-name
variants are merged when one is a prefix of another after normalisation
(lowercased, ignoring spaces and punctuation), provided the shorter name
has at least two word tokens. Merging is transitive, so all variants
sharing a common name prefix collapse into one row. For example
"Max Mustermann", "Max Mustermann, MRI",
"Max Mustermann_FIRMA", "Max Mustermann BLE 624" and
"Max Mustermann BLE624" all collapse into a single participant. The
two-word minimum keeps single-word names such as "Max" from pulling
in unrelated people (e.g. "Maximilian" or "Max Power").
export
Logical; if TRUE (default), an Excel workbook with two
worksheets is written: "Zusammengeführt" (the consolidated table)
and "Roh" (the unconsolidated reference table). The function then
returns the consolidated tibble invisibly. Set FALSE to skip the file
and return the tibble visibly, as before.
xlsx_path
Optional path for the exported workbook. If NULL
(default), the file is written as teams_attendance.xlsx into the
folder of the first input file (dirname(files[1])). Supplying a path
implies export = TRUE; a missing .xlsx extension is added.
An existing file is overwritten without asking.
A tibble with columns name, email (often empty) and one
numeric column per day (named by the date). A cell value of NA means
the participant did not attend on that day; 0 means they joined but
without measurable duration. The attribute "unmerged" holds the
unconsolidated version of the same table. With export = TRUE the
tibble is returned invisibly and an Excel file is written as a side effect.
The expected input is the flat, comma-separated CSV that Teams
provides, with one row per join/leave segment. The relevant columns are
display (display name), userName (e-mail / UPN, often empty),
joinDateTime and leaveDateTime (ISO 8601 timestamps). The day
of a report is taken from joinDateTime; multiple segments (rejoins)
on the same day are summed.
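The per-day summation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the sample rows, the helper `parse_ts()` and the assumption that timestamps carry whole seconds in UTC are all hypothetical.

```r
# Hypothetical segment data in the column layout described above.
segments <- data.frame(
  display       = c("Anna Gast", "Anna Gast", "Max Mustermann"),
  joinDateTime  = c("2026-05-18T09:00:00Z", "2026-05-18T10:00:00Z",
                    "2026-05-18T09:00:00Z"),
  leaveDateTime = c("2026-05-18T09:30:00Z", "2026-05-18T10:18:00Z",
                    "2026-05-18T09:30:00Z"),
  stringsAsFactors = FALSE
)

# Assumed timestamp layout; characters after the seconds are ignored.
parse_ts <- function(x) {
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
}

join  <- parse_ts(segments$joinDateTime)
leave <- parse_ts(segments$leaveDateTime)

# The day comes from the join time; the duration is leave minus join.
segments$day   <- format(join, "%Y-%m-%d")
segments$hours <- as.numeric(difftime(leave, join, units = "hours"))

# Sum all segments (rejoins) per participant and day, then round.
per_day <- aggregate(hours ~ display + day, data = segments, FUN = sum)
per_day$hours <- round(per_day$hours, 1)
per_day
```

Here the two segments of "Anna Gast" (30 and 18 minutes) add up to 0.8 hours for that day.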
File encoding is detected automatically via the byte-order mark: files
starting with FF FE are read as UTF-16LE, otherwise UTF-8 is assumed.
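A byte-order-mark check of this kind might look as follows. The helper name `guess_encoding()` is hypothetical and the sketch only mirrors the rule stated above (FF FE means UTF-16LE, anything else is treated as UTF-8); the function's actual internals may differ.

```r
# Hypothetical helper: guess the encoding from the first two bytes of a file.
guess_encoding <- function(path) {
  bom <- readBin(path, what = "raw", n = 2L)
  if (length(bom) == 2L && identical(bom, as.raw(c(0xFF, 0xFE)))) {
    "UTF-16LE"
  } else {
    "UTF-8"
  }
}

# Demonstration with two temporary files.
utf16_file <- tempfile(fileext = ".csv")
writeBin(as.raw(c(0xFF, 0xFE, 0x61, 0x00)), utf16_file)  # BOM + "a" in UTF-16LE

utf8_file <- tempfile(fileext = ".csv")
writeLines("display,userName", utf8_file)

guess_encoding(utf16_file)
guess_encoding(utf8_file)
```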
The canonical name shown for a merged participant is the longest display
name in the group (ties broken by frequency, then alphabetically).
Participations without a display name (e.g. dial-ins) are labelled
"(ohne Namen)"; those without an e-mail collapse into one such row.
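The tie-breaking order for the canonical name can be illustrated with a small base R sketch. `pick_canonical()` is a hypothetical helper, and measuring "longest" by character count is an assumption; the source does not say how length is determined.

```r
# Hypothetical helper: choose the canonical display name of one merged group.
# names_seen holds every display name observed for the group, with repeats.
pick_canonical <- function(names_seen) {
  counts <- table(names_seen)
  candidates <- data.frame(
    name = names(counts),
    n    = as.integer(counts),
    stringsAsFactors = FALSE
  )
  # Longest name first, then most frequent, then alphabetical.
  ord <- order(-nchar(candidates$name), -candidates$n, candidates$name)
  candidates$name[ord[1]]
}

pick_canonical(c("Max M.", "Max Mustermann", "Max Mustermann", "Max Musterfrau"))
```

In this call "Max Mustermann" and "Max Musterfrau" have the same length, so the more frequent "Max Mustermann" is chosen.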
The returned tibble carries an attribute "unmerged" holding the same
table without any consolidation (one row per distinct display name, no
e-mail and no name merging). It is meant as a side-by-side reference, e.g.
as a second sheet in an exported workbook. Retrieve it with
attr(result, "unmerged") directly on the returned tibble, as most dplyr
operations drop attributes.
When export = TRUE (or xlsx_path is supplied), an Excel
workbook is written with two worksheets: "Zusammengeführt" (the
consolidated table) and "Roh" (the unconsolidated table). Both sheets
get a bold, frozen header row, auto-fitted column widths and a white-to-green
colour scale on the per-day duration columns (white = low, green = high
attendance); empty cells (absences) stay uncoloured.
Known matching limits: name merging is intentionally aggressive, so two
different people whose names share a common prefix (e.g. the same first and
last name) may be merged into one row. Use the "unmerged" attribute
to cross-check. One person appearing under different e-mails stays in
separate rows unless their names also merge by prefix.
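The prefix rule behind merge_contained_names, and the limit just described, can be sketched as a pairwise check. The helpers `normalise()`, `n_tokens()` and `is_variant()` are hypothetical and written only from the description on this page; the transitive grouping of several variants is not shown.

```r
# Lowercase and drop everything except letters and digits.
normalise <- function(x) gsub("[^[:alnum:]]", "", tolower(x))

# Count word tokens, splitting on spaces and punctuation.
n_tokens <- function(x) {
  lengths(strsplit(trimws(x), "[[:space:][:punct:]]+"))
}

# TRUE if one name is a prefix of the other after normalisation and
# the shorter name has at least two word tokens.
is_variant <- function(a, b) {
  # Order the pair so that `a` has the shorter normalised form.
  if (nchar(normalise(a)) > nchar(normalise(b))) {
    tmp <- a; a <- b; b <- tmp
  }
  n_tokens(a) >= 2L && startsWith(normalise(b), normalise(a))
}

is_variant("Max Mustermann", "Max Mustermann, MRI")  # variants of one name
is_variant("Max", "Max Power")                       # blocked: one-word name
is_variant("Anna Schmidt", "Anna Schmidt-Meier")     # possibly two people
```

The last call returns TRUE even though the two names may belong to different people, which is the kind of case the "unmerged" attribute helps to spot.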
files <- list.files(
  system.file("extdata", "teams_attendance", package = "BioMathR"),
  full.names = TRUE
)
if (length(files) > 0) {
  res <- get_teams_attendance(files, export = FALSE)
  res
  # Unconsolidated reference version (e.g. for a second worksheet):
  attr(res, "unmerged")
}
#> # A tibble: 3 × 5
#>   name                   email          `2026-05-18` `2026-05-19` `2026-05-20`
#>   <chr>                  <chr>                 <dbl>        <dbl>        <dbl>
#> 1 Anna Gast              ""                      0.8          0.5         NA
#> 2 Max M.                 "max@firma.de"         NA            1           NA
#> 3 Max Mustermann (FIRMA) "max@firma.de"          0.5         NA            0.5
# \donttest{
if (length(files) > 0) {
  # Export both sheets to an Excel workbook:
  get_teams_attendance(files, xlsx_path = tempfile(fileext = ".xlsx"))
}
#> Excel-Datei geschrieben: C:\Users\PAULSC~1\AppData\Local\Temp\Rtmp08knor\filea2fc70052e4f.xlsx
# }