Local Data Directory (gitignored)

This directory holds raw Health Data Bridge exports needed by charts.mjs to regenerate the visualizations embedded in the sibling reports (report.html, caffeine-multidim.html, cycling-performance.html, cola-drill-down.html).

Raw `*.json` files are gitignored because they contain personal HealthKit data (timestamps, biometric values, food labels). Only this README and `.gitignore` are tracked.

Required files

After populating this directory locally, the chart generator expects:

| File | Health Data Bridge metric | Mode | Approx. size |
| --- | --- | --- | --- |
| `caffeine_1y.json` | `quantity_sample:dietaryCaffeine` | raw | ~110 KB |
| `sleep_1y.json` | `category_sample:sleepAnalysis` | raw, paginated | ~1.1 MB |
| `hp_hrv.json` | `quantity_sample:heartRateVariabilitySDNN` | bucketed/day | ~40 KB |
| `hp_rhr.json` | `quantity_sample:restingHeartRate` | bucketed/day | ~40 KB |
| `hp_steps.json` | `quantity_sample:stepCount` | bucketed/day | ~36 KB |
| `hp_kcal.json` | `quantity_sample:activeEnergyBurned` | bucketed/day | ~38 KB |
| `hp_workouts.json` | `workout` | raw | ~66 KB |
| `hp_glucose.json` | `quantity_sample:bloodGlucose` | raw, paginated | varies |
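
Before running the chart generator, it can help to check which of the files listed above are actually present. The following is a minimal sketch, not part of `charts.mjs`: the function name `missingFiles` is hypothetical, and the file list is copied from the table above.

```javascript
// Preflight sketch: report which expected exports are absent from a data directory.
const fs = require("fs");
const path = require("path");

// File names copied from the "Required files" table.
const REQUIRED_FILES = [
  "caffeine_1y.json",
  "sleep_1y.json",
  "hp_hrv.json",
  "hp_rhr.json",
  "hp_steps.json",
  "hp_kcal.json",
  "hp_workouts.json",
  "hp_glucose.json",
];

// Hypothetical helper: returns the subset of `required` that does not exist in `dir`.
function missingFiles(dir, required = REQUIRED_FILES) {
  return required.filter((name) => !fs.existsSync(path.join(dir, name)));
}
```

Calling `missingFiles("research/taeho-health/sleep-caffeine/data")` from the repo root returns an array of missing file names, which is empty when the directory is fully populated.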

Reproduction (full pull)

Prereqs: the Health Data Bridge iOS app must be paired and reachable. Verify with `cd web/packages/healthbridge-cli && bun run doctor` (expect `Overall: healthy`).

cd web/packages/healthbridge-cli

WIN_START=2025-12-19T00:00:00+09:00
WIN_END=2026-04-11T23:59:59+09:00
DATA=../../research/taeho-health/sleep-caffeine/data

# Caffeine (raw)
bun run query --object-type quantity_sample --identifier dietaryCaffeine \
  --mode raw --timezone Asia/Seoul \
  --start-at "$WIN_START" --end-at "$WIN_END" \
  --limit 1000 --json > $DATA/caffeine_1y.json

# Sleep (raw, paginated — 4 pages typically cover 110 days)
for c in 0 1000 2000 3000; do
  bun run query --object-type category_sample --identifier sleepAnalysis \
    --mode raw --timezone Asia/Seoul \
    --start-at "$WIN_START" --end-at "$WIN_END" \
    --limit 1000 --cursor "$c" --json > $DATA/sleep_p$c.json
done
node -e '
  const fs=require("fs");
  // Merge the four sleep pages; guard against a page that has no records array.
  const ps=[0,1000,2000,3000].map(c=>JSON.parse(fs.readFileSync(`'$DATA'/sleep_p${c}.json`)));
  const merged={...ps[0],records:ps.flatMap(p=>p.records||[]),nextCursor:undefined};
  fs.writeFileSync("'$DATA'/sleep_1y.json",JSON.stringify(merged));
'

# HRV / RHR / Steps / Calories (daily bucketed)
for METRIC in heartRateVariabilitySDNN:hrv restingHeartRate:rhr stepCount:steps activeEnergyBurned:kcal; do
  ID="${METRIC%:*}"
  OUT="${METRIC#*:}"
  bun run query --object-type quantity_sample --identifier $ID \
    --mode bucketed --bucket day --timezone Asia/Seoul \
    --start-at "$WIN_START" --end-at "$WIN_END" \
    --limit 1000 --json > $DATA/hp_$OUT.json
done

# Workouts (raw)
bun run query --object-type workout \
  --mode raw --timezone Asia/Seoul \
  --start-at "$WIN_START" --end-at "$WIN_END" \
  --limit 1000 --json > $DATA/hp_workouts.json

# Glucose (raw, paginated — typically 3-4 pages)
for c in 0 1000 2000 3000; do
  bun run query --object-type quantity_sample --identifier bloodGlucose \
    --mode raw --timezone Asia/Seoul \
    --start-at "$WIN_START" --end-at "$WIN_END" \
    --limit 1000 --cursor "$c" --json > $DATA/glucose_p$c.json
done
node -e '
  const fs=require("fs");
  // Merge the four glucose pages; guard against a page that has no records array.
  const ps=[0,1000,2000,3000].map(c=>JSON.parse(fs.readFileSync(`'$DATA'/glucose_p${c}.json`)));
  const merged={...ps[0],records:ps.flatMap(p=>p.records||[]),nextCursor:undefined};
  fs.writeFileSync("'$DATA'/hp_glucose.json",JSON.stringify(merged));
'
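
The two inline `node -e` merge steps above share the same logic, so they could be factored into one small script. This is a sketch under assumptions: it relies only on what the snippets above already assume (each page is a JSON object with a `records` array and an optional `nextCursor`), and the names `mergePages` and `mergePageFiles` are hypothetical.

```javascript
const fs = require("fs");

// Merge page objects into a single export. Top-level fields are taken from
// the first page, `records` are concatenated in page order, and the
// pagination cursor is dropped.
function mergePages(pages) {
  if (pages.length === 0) throw new Error("no pages to merge");
  const { nextCursor, ...head } = pages[0];
  return { ...head, records: pages.flatMap((p) => p.records ?? []) };
}

// Read page files from disk, skipping paths that do not exist, and merge them.
function mergePageFiles(paths) {
  const pages = paths
    .filter((p) => fs.existsSync(p))
    .map((p) => JSON.parse(fs.readFileSync(p, "utf8")));
  return mergePages(pages);
}
```

Note that the fixed cursor list `0 1000 2000 3000` caps a pull at four pages. If a longer window produces more records than that, the loop bounds need extending, otherwise the merged file will be silently truncated.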

Regeneration

cd <repo root>
node research/taeho-health/sleep-caffeine/charts.mjs

This reads everything in data/, computes per-day aggregates, and writes:

- `research/taeho-health/sleep-caffeine/charts/*.svg`: individual SVG files (also tracked in git)
- Updated HTML reports (`report.html`, `caffeine-multidim.html`, `cycling-performance.html`, `cola-drill-down.html`) with embedded chart cards
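
To illustrate what "per-day aggregates" could mean for a raw export, here is a rough sketch that sums sample values by local calendar day. It is not taken from `charts.mjs`. The record field names `startDate` and `value` are assumptions about the export schema, and `sumByLocalDay` is a hypothetical name. The time zone default matches the `--timezone Asia/Seoul` flag used in the queries.

```javascript
// Convert an ISO timestamp to a YYYY-MM-DD day in the given IANA time zone.
function localDay(iso, timeZone) {
  const fmt = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  });
  const parts = Object.fromEntries(
    fmt.formatToParts(new Date(iso)).map((p) => [p.type, p.value])
  );
  return `${parts.year}-${parts.month}-${parts.day}`;
}

// Sum `value` per local day. ASSUMPTION: each record looks like
// { startDate: "<ISO 8601 timestamp>", value: <number> }.
function sumByLocalDay(records, timeZone = "Asia/Seoul") {
  const totals = {};
  for (const r of records) {
    const day = localDay(r.startDate, timeZone);
    totals[day] = (totals[day] ?? 0) + r.value;
  }
  return totals;
}
```

Bucketing by local day rather than UTC day matters here because a late-evening sample in Seoul falls on the previous day in UTC, which would shift caffeine intake relative to the following night's sleep.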

If data/ is empty or incomplete, the script logs which files are missing and skips the charts that depend on them, so partial regeneration still works.