A user table stores email addresses and you need to understand which email domains are most popular. Extract the domain from each email and return a count per domain, sorted by user count descending then domain name ascending to break ties.
df
| column | type |
|---|---|
| user_id | int |
| str |
| user_id | |
|---|---|
| 1 | alice@gmail.com |
| 2 | bob@yahoo.com |
| 3 | carol@gmail.com |
| 4 | dave@gmail.com |
| 5 | eve@yahoo.com |
| domain | user_count |
|---|---|
| gmail.com | 3 |
| yahoo.com | 2 |
Three users have gmail.com addresses and two have yahoo.com, so gmail.com ranks first by user count.
user_id = [1, 2, 3], email = ['alice@gmail.com', 'bob@yahoo.com', 'carol@gmail.com']