Sprint - June 24 to July 5, 2024 #23075

Closed
benjackwhite opened this issue Jun 19, 2024 · 10 comments
Labels
sprint Sprint planning

Comments

@benjackwhite
Contributor

Global Sprint Planning

3 things that might take us down

Team sprint planning

For your team sprint planning copy this template into a comment below for each team.

# Team ___

**Support hero:** ___

## Retro

<!-- Grab the high and low priority items from last time and add whether that item was completed or not -->

- 

## Hang over items from previous sprint

<!-- For each item, decide to re-prioritise (and add below) or deprioritise -->

- Item 1. prioritised/deprioritise

## OKR

1. OKR, status (red/yellow/green) and action points if yellow/red


### High priority

-

### Low priority / side quests

-

@benjackwhite benjackwhite added the sprint Sprint planning label Jun 19, 2024
@benjackwhite benjackwhite pinned this issue Jun 19, 2024
@pauldambra
Member

pauldambra commented Jun 19, 2024

Team Error Tracking, Tracker of Errors

Support hero: @daibhin

  • paul out a day
  • Manoel out 2 or 3 days (TBC)

items from previous sprint

High priority

  • hiring (@pauldambra) 🔴
  • hogql filters performance (@pauldambra ) ✅
    • death by a 1000 cuts here...
    • turned this on for $large-eu-customer now though
    • so we might be able to start deleting old code 🤞 now
  • masking text in the screenshot images on Android and iOS @marandaneto 🟡
    • Android
    • iOS just research/benchmarking with the Android results - more complicated - WIP 🟡
  • Error tracking (@pauldambra and @daibhin) 🟡
    • first pass UI out
    • user interview booked with $large-eu-customer
    • some grouping research done but lots of context switching 😓
    • UI is under a feature flag

OKR

  1. OKR, status (red/yellow/green) and action points if yellow/red
  • 🟡 quality and consolidation
    • mobile replay open beta
      • out free and being promoted in onboarding
    • replay captures every site
      • David working with rrweb to improve CSS parsing, already a major reduction in issues as a result
    • people can find the valuable recordings
      • universal filters in progress
  • 🟡 Error tracking MVP
    • started
  • 🔴 Hire at least 2 amazing colleagues

High priority

  • Q3 goals - everyone
  • masking text in the screenshot images on iOS @marandaneto
  • Replay React native plugin for Android and iOS @marandaneto
    • Check if it's possible to make the native plugin optional so it works with Expo OOTB.
  • Universal filters released for everyone @daibhin
    • write transformation from simple / advanced filters to new universal filters
  • Error tracking @daibhin @pauldambra

Low priority / side quests

  • persistent replay queue in posthog-js @pauldambra (TBC)
    • switching all of replay to async just for this is bothering me, so I'm going to park for now
  • we are recruiting beta testers for Android and iOS - (we should have screenshots now) @marandaneto @annikaschmid

@benjackwhite
Contributor Author

benjackwhite commented Jun 19, 2024

Team CDP, deliverer of Hogs

Ben + Marius OOO for second week of sprint (50% capacity)

Retro

  • BW - Felt super productive
  • MA - Great progress all round

Hang over items from previous sprint

  • Have the new "CDP" service in production, flagged to Team 2 @benjackwhite
  • Get all of our existing Zapier actions (e.g. slack notifications) running as hog functions (dog-fooding) @benjackwhite
    • Not done - we weren't really using actions for this, but multiple destinations are set up
    • (Maybe by end of sprint we have Customer.io / hubspot working)
  • Missing hog features to support the above (related issue) @mariusandra
  • Make editing Hog not suck (syntax highlighting, autocomplete) @mariusandra
    • Partially done - likely to be done end of sprint
  • no semicolons! @mariusandra
  • Brownbag on Hog and new CDP architecture

OKR

  1. 🔴 Get the new CDP in front of real users
  • why red? Planned vacations mean we don't want to hand it out and walk away immediately - that said, we are really not far off and probably could have hit this without vacation

Sprint plan

Goal: Have DeliveryHog ready for Closed beta testing
(See #22833)

  • Scaling work @benjackwhite
    • Overflow topic with detection for slow functions
    • Metrics, dashboards, alerts
  • Secret management (probably go for simple encrypted field in django) @benjackwhite
  • Hook up to rusty webhook service (hopefully Brett, if not Marius)
  • Ensure every existing plugin destination could be built with Hog and build them as templates @mariusandra
  • Go through existing CodeQL reports and sanity check the safety of the system @mariusandra
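
A minimal sketch of what "overflow topic with detection for slow functions" could look like, in Python for illustration only. Everything here is an assumption rather than the actual CDP implementation: the `SlowFunctionDetector` name, the `"main"` / `"overflow"` topic names, the 100 ms threshold and the rolling-average rule are all hypothetical.

```python
from collections import defaultdict, deque


class SlowFunctionDetector:
    """Tracks recent run times per function and picks a topic for the next run.

    Hypothetical helper: names, threshold and window size are illustrative.
    """

    def __init__(self, threshold_ms: float = 100.0, window: int = 20) -> None:
        self.threshold_ms = threshold_ms
        # Keep only the last `window` durations for each function id.
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, function_id: str, duration_ms: float) -> None:
        """Store how long one invocation of the function took."""
        self.samples[function_id].append(duration_ms)

    def topic_for(self, function_id: str) -> str:
        """Route to the overflow topic when the rolling average is too slow."""
        durations = self.samples[function_id]
        if durations and sum(durations) / len(durations) > self.threshold_ms:
            return "overflow"
        return "main"
```

Because the window is bounded, a function that recovers would drift back to the main topic once its recent runs are fast again.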

Side quests

  • Load testing of Hog functions in our own account
  • Test execution with sample event
    • Estimate of throughput for a function (based on filters)
    • Example event to choose from events matching filters
  • Add calculation and limits for resource usage (memory)

@daibhin daibhin changed the title Sprint - June 24 to May 5, 2024 Sprint - June 24 to July 5, 2024 Jun 19, 2024
@fuziontech
Member

fuziontech commented Jun 19, 2024

Team Click Haus, Haus of the Hogs

OKR Q2 2024

Objective

James as a Service -> Clickhouse as a Service

  • Key Results:
    • Better Visibility
      • Regularly testing backups
      • Monitoring/alerting
      • Mutations
      • Moves
    • Management
      • Managing/killing mutations
    • Self Serve
      • Schema design feedback (James non-blocking)
      • Schema management
    • Automation
      • Replace/Upgrade replicas
      • Upgrading to 24.04
      • Disk configs

Board

https://github.com/orgs/PostHog/projects/85/views/2

Retro

Infra

  • 🔥 ZK incident followups
  • ✍️ Write up the outcome from CH strategy planning
  • 🏃 Onboarding Dani
  • ♻️ Replace offline nodes w/ nvme nodes
  • 📟 Monitoring and Alerting on EU Coordinator
  • 🧪 Test incremental backup restores
  • ⏩ Move parts around so last 3 months of data are on NVME on US
  • 🗑️ Delete persons on teams that are still ingesting data (for personless events)
  • 👯 Debug duplicate events in sharded_sessions on EU (bad replication)

High priority

Infra

  • Retire old Offline Nodes on US Cluster @fuziontech
  • Monitoring and alerting on clickhouse_events_json_debug table: events table counts (both)
  • 🧪 Test incremental backup restores @Daesgar
  • ⏩ Move parts around so last 3 months of data are on NVME on US @Daesgar
  • Remove projections in EU on events table @fuziontech
  • 🗑️ Delete persons on teams that are still ingesting data (for personless events) @fuziontech
  • Configs in Ansible for ClickHouse US and EU @Daesgar
  • 🏃 4 new im4gn.16xlarge replicas for US @fuziontech

@benjackwhite
Contributor Author

Team Infra

Retro

  • Ben out for 50%
  • Sašo out 1 day
  • Frank out 100%

Hang over items from previous sprint

  • 🟢 SIEM
  • 🟡 secrets
    • 🟢 vault monitoring
    • 🟢 vault is all configured with TF
    • 🔴 backups @danielxnj
      • Decided to move to a proper database backend, which will take care of backups for us
    • 🔴 CDP integration @ZeleniJure
      • Opted against this as it isn't fit for purpose (the scale we use it at, etc.)
  • 🟡 VPA dashboard to understand over-provisioned resources @ZeleniJure
    • Still last tweaks in prod to get it working (and then we need to implement the recommendations)
  • 🟢 roll out new codified deployment strategy @frankh
  • 🟢 Other cost optimization stuff
    • Removed the overprovisioning deployments
    • Tagged Clickhouse disks (manual atm as it is still not in terraform)
    • S3 costs reducing thanks to incremental backups of clickhouse
  • 🔴 internal RBAC review (tbc)
  • 🟢 S3 mounted volume for Ingestion (geoip + blobby models)

OKR

  1. 💪 Deploy with confidence 🟢
  • Finalize our Canary Deploy process 🟡
  2. 🚨 Improved alerting and monitoring 🟢
  • SIEM work has made this a priority
  3. 🔒 Deeper Security 🟡
  4. 💰 Continued cost control 🟢

High priority

  • Q3 Planning @benjackwhite
  • Vault in use for production secrets (k8s configuration for example) @ZeleniJure @danielxnj
    • why? We've done most of the thinking, now we just need to do it to complete our "auditable security quarter"
  • Over-provisioned resource review @ZeleniJure
    • why? Part of cost saving efforts to reduce our over provisioned deployments
  • Audit all our AWS IAM accounts (we have everything in SSO so we shouldn't have any normal accounts there) @danielxnj

Side quest

  • access logs to django @danielxnj
  • Make sure all kafka topics are in IaC (e.g. clickhouse_heatmap_events)

@neilkakkar
Collaborator

neilkakkar commented Jun 19, 2024

Team Feature Success

Support hero: @dmarticus
Days off:
Juraj: 5 days
Phani: 0 days
Dylan: 0 days
Neil: 2 days

Retro

Hang over items from previous sprint


OKRs

  1. Make sure feature flags can handle 10x current scale
  2. Polish new experiments UI & collect feedback
  3. Add most requested surveys functionality

High priority

Low priority / side quests / maybe Neil will get to this next year

  • Temporal queues for feature success - @neilkakkar
  • Setup instrumentation for flip-flopping problem of experiment significance - @neilkakkar

@EDsCODE
Member

EDsCODE commented Jun 19, 2024

Team Data <->, collecting of Hogs and more

OKR Q2 2024

Objective

Release data warehouse to everyone

  • Key Results:
    • Integration first experience
      • schemas are reliable
      • modeling of each integration is clear
      • Good automatic roll up views and joins
      • Wizard to onboard people
    • Establish a solid pattern to build integrations
    • Complete data warehouse experience in the rest of the app (insights, feature flags, experiments)

Retro

  • incremental syncs/refactored to new DLT sync system which will make integrations just a config @Gilbert09
  • snowflake integration @Gilbert09
  • templated joins @EDsCODE
  • UI revamp @EDsCODE
  • sync frequency @EDsCODE
  • collecting user feedback @EDsCODE
  • warehouse props in feature flags and experiments
  • person model batch exports @tomasfarias

High Priority

Side quests:

  • warehouse props in feature flags and experiments
  • templated joins

@raquelmsmith
Member

raquelmsmith commented Jun 19, 2024

Team Growth

Retro

Retro items
  • @raquelmsmith
    • Finish roll-out of personless events
      • Update migrate script to also report usage at time of migration
      • Roll out to 5-10k customers at a time
      • Fix changelog link
      • Fix sentry errors
    • Referrals: add/edit referrer and redeemer
    • Start working on toolbar dashboard template thing
    • Prune usage report queries
    • Unexpected bill emails
  • @zlwaterfield
    • clean up outstanding feature flags (reverse-proxy-onboarding and email-verification-ticket-submission)
    • ship outstanding PRs (Teams addon - remove modal, Projects pay gate on free should be to paid not teams)
    • subscribe to all products
      • main API changes (basic serializer set up, products=all query, default paid/free plan map, functions for upgrading/downgrading)
      • feature flag in UI on billing page with main button alterations (no comparison or other UI changes)
    • startup plan metadata RFC
    • emails new teams addon customers
    • re-run the plans map and compare with the new auto-cancel functionality

Q2 Goals

✅=finished 🟡=in progress 🔴=won't finish

  1. 🟡 Create a flow in product analytics onboarding to fill out a dashboard template using actions (Raquel)
  2. 🟢 Simplify our subscription flows (Zach, supported by Raquel)
    • ✅ Move teams plan to being an addon
    • 🟢 Change to a single subscribe for all products
  3. ✅ Launch pricing changes (Raquel)
    • ✅ Personless events - will help us reach more customers at an affordable price

This sprint

High priority

  • Q3 planning
  • @raquelmsmith
    • Support for first week
    • Pricing page experiments - iterate here with cory and eli until it's done
    • Stay on top of revenue issues
    • Start working on toolbar dashboard template thing
    • Keep on top of personless comms and customer issues and metrics
    • Lots of interviews...
  • @zlwaterfield
    • Complete subscribe to all products
      • frontend changes
      • release under feature flag to new users
      • backfill existing users and communicate with them
      • (if time permits - probably next sprint) cleanup! remove/clean single product subscribe code where we can.
    • Start on the Stripe metadata changes - close RFC, updates to Zapier, work on backfill, etc.
    • re-run the plans map and compare with the new auto-cancel functionality

@bretthoerner
Contributor

bretthoerner commented Jun 19, 2024

Team Pipeline

Off: Tiina
Support: Xavier

Retro

High priority

  • Improve ingestion efficiency by moving the last Kafka waits at the end of the batch (Brett)
  • New customers write to persons only from $set, $identify, $create_alias and $merge_dangerously, and new products also only read persons (Tiina)
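
To illustrate the "move the Kafka waits to the end of the batch" idea, here is a small Python sketch, not the actual ingestion code. `fake_produce` is a hypothetical stand-in for a Kafka produce call that resolves once the broker acknowledges the message; the two functions contrast waiting on each ack inside the loop with starting every produce first and waiting once at the end.

```python
import asyncio


async def fake_produce(event, log):
    # Hypothetical stand-in for a Kafka produce call: the sleep models the
    # round trip to the broker, after which the message counts as acked.
    await asyncio.sleep(0.01)
    log.append(event)


async def process_batch_waiting_per_event(events, log):
    # Baseline: every iteration blocks until the previous ack has arrived,
    # so total wait time grows with the number of events.
    for event in events:
        await fake_produce(event, log)


async def process_batch_waiting_at_end(events, log):
    # Start all produces up front, then wait for the acks together once the
    # whole batch has been handed off.
    pending = [asyncio.create_task(fake_produce(event, log)) for event in events]
    await asyncio.gather(*pending)


if __name__ == "__main__":
    acked = []
    asyncio.run(process_batch_waiting_at_end(["a", "b", "c"], acked))
    print(sorted(acked))
```

The trade-off in this sketch is that a failure only surfaces when the batch finishes, so error handling has to cover the whole batch rather than a single event.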

OKR

✅=finished 🟢=on track to finish this quarter 🟡=might not finish 🔴=won't finish
✔️=progressed last sprint ; ➡️=planned work for this sprint

🟢✔️ Deprecate posthog-events by moving to capture-rs fully (21659)
🟢✔️ Visibility into what's in the ingestion queues and past performance (20985)
🟢✔️ Fast configuration options to speed up incident recovery, e.g. by token send to overflow or drop (21662)
🟢✔️ Iterate on person processing to make it faster and cheaper (21048)
🟢✔️ Batch exports UX improvements, e.g. error notifications and UI revamp (21139)
🟢 Support adding new products (21665)
🟢✔️ Deprecate scheduler & jobs deployments, runEveryX plugins and kafkajs consumers (21656)

High priority

  • Fix excessive overrides written in support of Personless mode (Brett)
  • Hog support for Rusty-Hook (Brett) (This takes a back seat to the one above)
  • capture-rs: fix billing limits (Xavier)

Low priority / side quests

  • Finish hog-rs to posthog repo migration (deploy out of posthog through state.yaml)
  • Collect rdkafka metrics (broker response latency, error rates) for all node producers & consumers (Xavier)
  • capture-rs: read redis out-of-band (avoid latency if redis slow)

@robbie-c
Collaborator

robbie-c commented Jun 19, 2024

Team web analytics session table

Support hero: @robbie-c

Retro

Some personless events work (low priority tidying up of posthog-js) was blocked on https://github.com/PostHog/product-internal/pull/612/files, so I got to work on the sessions table instead. It's pretty close to being ready, maybe by the end of this sprint. PR: #23023 . I've done the first 90%, just finishing off the second 90%

  • 🟢 Start emailing customers for research (emailed ~30, spoke to 5)
  • 🟢 Posthog-js: Rewrite posthog-js person props backfilling to include initial current url
  • 🛑 Posthog-js: handle setPersonPropertiesForFlags properly when personless
  • 🛑 Posthog-js: handle manually calling $set properly when personless
  • 🛑 Write RFC for client changes for personless events based on posthog-js, to be approved by mobile sdk teams
  • 🆕 Get the sessions table v2 ingesting events

Customer interviews were useful, mostly reasonably obvious stuff that's already on my roadmap (separating sites better, live user count, auto refreshing) but leads.io were extremely specific, and would need conversion rate per landing page.

OKR

  1. Make querying fast enough for large customers
  2. Do personless events work where necessary
  3. Iterate on customer feedback
  4. Product management work

High priority

  • Get session table v2 PR over the line
  • Start backfilling, prioritising EU, and team 2 on US for dogfooding

Stretch goal

  • Get a version of WA up that is terrifyingly fast because it can just use the sessions table + it can sample

@Twixes
Collaborator

Twixes commented Jun 19, 2024

Team Product Analytics

Support hero: @aspicer @skoob13

Time off: Michael half the sprint, Sandy two days

Retro

High priority

  • 🟢 Getting rid of remaining legacy filters use (owner: @thmsobrmlr) – much refactoring done, though some bits still to be done
  • 🥏 Query performance measurement (owner: @aspicer) – deprioritized overall, scope turned out to be much larger. Turned to more targeted optimizations (e.g. Py 3.12)
  • 🔴 Project Environments finally (owner: @Twixes) – didn't get to it

Low priority / side quests

  • 🔴 Insight background reloads monitoring/cleanup (@webjunkie) – focused on the support queue
  • 🟢 Onboarding Georgii (owner: @Twixes) - multiple breakdowns are progressing

Extra things done

  • Q3 planning
  • Persons-on-events cleanup
  • Live events deployment (thanks to @fuziontech and @frankh)
  • 3 issues with our Zapier integration that Zapier forwarded to me (@Twixes) last week
  • Support overflowing – Julian's been doing a great job turning the reports into solvable issues, but at our volume there isn't enough time for the hero to also ship all the fixes in time
  • Working with a contributor, Nikita, who's in the process of shipping analytics alerts

Hang over items from previous sprint

  • Getting rid of remaining legacy filters use, the remaining bits (owner: @thmsobrmlr)
  • Project Environments (owner: @Twixes)

OKR

  • HogQL-based querying

    • Convert the remaining legacy queries to HogQL and release to public (Thomas, Julian, Marius)
      • 🟢 Insights – rolled out, although fixes needed to be rock-solid
      • ⚪ Cohorts
    • Remove legacy querying backend (Thomas, Julian)
      • 🟢 Clean up or rewrite dashboardLogic
      • 🟠 Convert filters to query (insights, notebooks, activity log, experiments) 👈 Thomas is on it
    • Missing Product Analytics features (Thomas, Julian)
      • 🟠 Breakdowns (multiple) in literally everything 👈 Georgiy is on it
      • 🟠 Make a list based on GitHub issues from customer requests…
      • ⚪ Fix those issues
    • Missing HogQL features (Tom, Marius)
      • 🟠 Type system, JSON 👈 Tom is on this, being in Data Warehouse
      • ⚪ Missing things when building funnels
  • Querying and processing performance (Thomas, Julian)

    • Global performance overview dashboards 👈 Sandy is on it
      • 🟠 Insights
      • ⚪ Exports
      • ⚪ Cohort recalculations
    • Query request tracing
      • ⚪ Possibly query runner Python optimizations
      • ⚪ Exports improvements
    • ⚪ Identify top 5 query optimizations in terms of impact
  • Artificial Hog / Post Intelligence (Michael)

    • ⚪ Ask a question to get a magical insight (aware of your taxonomy)
    • ⚪ Figure out infra for upgrading queries and models
    • ⚪ Product-wide framework for opting into sharing with OpenAI
  • Activation (side quest: Michael)

    • ⚪ Michael to work with Growth to identify optimizations to getting started with Product Analytics 👈 pending the product manager hire a bit

High priority

  • Experiments migrated from legacy trends/funnels to HogQL-based (owner: @thmsobrmlr)
  • Getting rid of remaining legacy filters use continued (owner: @thmsobrmlr)
  • Multiple breakdowns in Trends released to users (owner: @skoob13)
  • Project Environments (owner: @Twixes)
  • Insight background reloads monitoring/cleanup (@webjunkie)

Low priority / side quests

  • Working with Nikita to get analytics alerts shipped (owner: @aspicer)

10 participants