Greybeam

Software Development

San Francisco, California · 700 followers

Multi-engine Query Orchestrator

About us

Snowflake isn't expensive. In fact, it can be one of the cheaper cloud data warehouses, but only if you set it up properly. Greybeam can help with that. Save up to 90% with a multi-engine data stack.

Website: https://www.greybeam.ai
Industry: Software Development
Company size: 2-10 employees
Headquarters: San Francisco, California
Type: Privately Held
Founded: 2024

Locations

  • Primary

    149 New Montgomery St

    San Francisco, California 94105, US



Updates

  • dbt Labs brought order to chaos for all analytics teams.

    Kyle Cheung, ❄️ co-founder & CEO | Run Snowflake queries on DuckDB, save up to 85%

    What's the one thing engineers LOVE about dbt Labs? This week Arsham and I ran around Coalesce armed with our magic microphone to ask a few burning questions. We cornered founders, engineers, and everyone in between to get their unfiltered takes. 🎥 Here’s part 1 of our mini-series. What's YOUR favorite thing about dbt? Drop it below 👇 Massive thanks to the brave souls who let us ambush them: Oliver Laslett, Rick Radewagen, Christophe Blefari, William Tsu, Jon Johnson, Benjamin Segal, Chloé L., Ben Lerner, Melanie, Jericho Villareal, David S., and Jay 🙏

  • But you should Select Star!

    Kyle Cheung, co-founder & CEO

    This is why you should never SELECT *.

    Snowflake and DuckDB are columnar databases: they store each column separately. When you only need 3 columns from an 80-column table, they should read only those 3 columns from disk. This is called column pruning. Most query optimizers, including the ones in Snowflake and DuckDB, will automatically detect columns that are referenced but never used in a query and prune them. For example, in the query below, the optimizer should know to retrieve only name from the customers table despite the SELECT * in the subquery:

    SELECT name FROM (SELECT * FROM customers)

    Most of the time it just works. But certain things can break it. I ran into this while testing DuckDB 1.4.0 (upgrading from 1.3.2) and noticed queries that once took 10s were now taking over 100s. The updated query optimizer was not pruning 60 unused columns from a 2-billion-record table, all because there was a SELECT * in an early CTE. Snowflake is guilty of the same, though the difference in execution time is not as stark. Try it yourself!

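    A quick way to check whether pruning actually happened is to read the query plan. Here's a minimal sketch, assuming a hypothetical customers table and using DuckDB's EXPLAIN; the scan operator in the plan should list only the columns the query really needs.

    -- Sketch: does the optimizer prune unused columns behind a SELECT * CTE?
    -- (hypothetical customers table; run in DuckDB and inspect the plan)
    EXPLAIN
    WITH wide AS (
        SELECT * FROM customers   -- nominally pulls in every column
    )
    SELECT name FROM wide;        -- only "name" is actually needed

    -- If the scan over customers lists just "name", pruning worked;
    -- if all the columns show up, the SELECT * defeated it.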
  • Greybeam reposted this

    Kyle Cheung, co-founder & CEO

    1 million queries routed to DuckDB. 82% cost savings. 10x faster queries on average. One of our earliest Greybeam customers just crossed this milestone 🎉

    At the beginning of this year, when Arsham and I started building Greybeam, we hypothesized that executing Snowflake queries on DuckDB would save customers up to 70% on their workloads. Now that we've been in production for some months with a small handful of customers, we've had to revise that number: for some workloads, we've seen savings reach a staggering 95% while making queries faster on average.

    It's amazing to see how far we've come, and even more amazing to deliver on our original promise. See you at 10 million!

  • Why is my query taking 9 hours? Chances are your queries fall into one of these 7 query-pattern traps https://lnkd.in/gpwUTj_E

    Kyle Cheung, co-founder & CEO

    The cleaner your SQL looks, the worse it might perform.

    Take this perfectly readable join:

    FROM orders
    JOIN customers
      ON orders.customer_id = customers.id
      OR orders.legacy_customer_id = customers.id

    Looks elegant. Runs terribly. Why? Snowflake can't use hash joins with OR conditions! There is no longer one clear key to partition on, and the query optimizer has to account for matches on either column.

    The ugly fix that actually works:

    SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id
    UNION
    SELECT * FROM orders JOIN customers ON orders.legacy_customer_id = customers.id

    Less readable, but nearly 40x faster. This OR-join pattern shows up everywhere, and it's just one of many innocent-looking query patterns that destroy performance. I just wrote a blog that deep dives into 7 ways to make your queries run faster. The others include:
    - Join optimization
    - Query pruning
    - Proper use of warehouse cache
    - Choosing the right query engine

    If your queries are slow and expensive, these patterns might be why. 👇 https://lnkd.in/g_v7PjPM

    Edit: the SF10 on the left is a screenshot error. If we ran the same query on SF10, it would've taken 2h 57m!

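    For anyone who wants to try the rewrite end to end, here is a minimal, self-contained sketch with hypothetical toy tables (runnable in DuckDB or most ANSI SQL engines), contrasting the OR join with the UNION version:

    -- Hypothetical toy schema to exercise the pattern locally.
    CREATE TABLE customers (id INT, name TEXT);
    CREATE TABLE orders (order_id INT, customer_id INT, legacy_customer_id INT);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (100, 1, NULL), (101, NULL, 2);

    -- Readable but hash-join-hostile: the OR leaves no single join key.
    SELECT o.order_id, c.name
    FROM orders o
    JOIN customers c
      ON o.customer_id = c.id OR o.legacy_customer_id = c.id;

    -- The rewrite: two hash-joinable queries, with UNION deduplicating
    -- rows that match on both keys.
    SELECT o.order_id, c.name
    FROM orders o JOIN customers c ON o.customer_id = c.id
    UNION
    SELECT o.order_id, c.name
    FROM orders o JOIN customers c ON o.legacy_customer_id = c.id;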
  • Greybeam reposted this

    Kyle Cheung, co-founder & CEO

    The best way to optimize Snowflake? Don't use it (for everything).

    Snowflake excels at complex, heavy analytics, but nearly all everyday queries are small, often looking at only a slice of data: the last 12 months, a specific customer, a specific segment. Small queries run faster and cheaper on small engines like DuckDB. You wouldn't use a car wash to clean your phone screen. So why would you use Snowflake for your small queries?

    Your CFO is crying and your data team is spending more time optimizing costs than building. Our customers are saving over 80% on Snowflake costs while queries run 7x faster through Greybeam, because they're finally using the right engine for each query.

    What % of your Snowflake queries could run just fine on DuckDB?

  • The Snowflake Summit already cost you an arm and a leg. Don't let their queries do the same!

    Kyle Cheung, co-founder & CEO

    To those attending the Snowflake Summit, be wary of mad men jumping over the stalls... They're trying to save you money. 💰 Using a big flush 🚽 for your small twinkles ✨ is costing you time and money. If toilets can have a small flush, why can't Snowflake? Most queries don't need massive clusters--but you're paying for them anyway. Greybeam gives you the missing small flush option. Light queries go to DuckDB, heavy analytics stay on Snowflake where they belong. Stop flushing money down the drain. And start saving up to 85% on Snowflake costs because you're finally using the right tool for the right job.

  • Be like Smitty. Be #1.

    Kyle Cheung, co-founder & CEO

    Have you always aspired to be like Smitty Werbenjagermanjensen 🧽? If the answer is yes, join us at Greybeam and be employee #1.

    📍 𝗟𝗼𝗰𝗮𝘁𝗶𝗼𝗻: In-person in San Francisco
    💵 𝗖𝗼𝗺𝗽𝗲𝗻𝘀𝗮𝘁𝗶𝗼𝗻: $160K - $200K + equity
    📋 𝗪𝗵𝗮𝘁 𝘆𝗼𝘂'𝗹𝗹 𝘄𝗼𝗿𝗸 𝗼𝗻: Backend-heavy; you'll do a little bit of everything

    More details here 👉 https://lnkd.in/gnG9fAKK

