-
-
Save arboledasaa/80d360ffdca4ba026344f8a99c808b0d to your computer and use it in GitHub Desktop.
I’m not worried about performance for the sake of this exercise.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--In the Users table, the Location column’s datatype is NVARCHAR(100). | |
--People can type in whatever they want, which leads to a lot of oddball one-off locations. | |
--We should filter for a minimum number of people involved, especially since we’re considering paid advertising. | |
--Let’s filter for only locations with at least 100 people in them. | |
--There are null and empty locations. Those should probably be excluded. | |
SET STATISTICS IO ON; | |
BEGIN TRY | |
DROP TABLE #locations; | |
END TRY | |
BEGIN CATCH SELECT 1 END CATCH; | |
Select [Location] | |
into #locations | |
from [dbo].[Users] | |
where [Location] is not null | |
and [Location] <> '' | |
group by [Location] | |
having count(*) >=100; | |
/* | |
Missing Index Details from SQLQuery6.sql - BVQ647-ZBOOK14G.StackOverflow2013 (DS\BVQ647 (51)) | |
The Query Processor estimates that implementing the following index could improve the query cost by 90.8365%. | |
*/ | |
/* | |
USE [StackOverflow2013] | |
GO | |
CREATE NONCLUSTERED INDEX Location | |
ON [dbo].[Users] ([Location]) | |
GO | |
*/ | |
/* | |
are we going to keep asking this question? if so maybe create the index. | |
We are as we are going to keep asking who is source and sink. | |
before | |
logical reads 45184 | |
after | |
logical reads 3030 | |
*/ | |
--Part of your solution is going to involve looking at the Posts table, which stores questions and answers (and other things.) A few columns to consider: | |
--Posts.PostTypeId = when 1, the Post is a Question. When 2, it’s an Answer. | |
--Posts.OwnerUserId = the Users.Id who owns the question or answer. | |
--Posts.Score = the quality of the question or answer. | |
--For this exercise, I’m not expecting a specific “right” or “wrong” answer – instead, for this one, you’re probably going to have a good time sharing your answer in the comments, and comparing your answer to that of others. Feel free to put your queries in a Github Gist, and include that link in your comments. I’ll circle back in a week and write about different ways to solve it. Have fun! | |
--They can be separate queries, and I’m not worried about performance for the sake of this exercise. | |
--First, what are the top 10 Locations populated with users who seem to be really helpful, meaning, they write really good answers? | |
--maybe this? | |
select top(10) [dbo].[Users].[Location], avg(Posts.Score) as avgScore | |
from [dbo].[Posts] | |
join [dbo].[Users] on [Posts].OwnerUserId = [Users].Id | |
join #locations on [dbo].[Users].[Location] = #locations.Location | |
where Posts.PostTypeId = 2 | |
group by [dbo].[Users].[Location] | |
order by avgScore desc; | |
--Second, what are the top 10 Locations where people seem to need the most help? | |
--Meaning, they ask a lot of questions, but they do not seem to be answering those of their neighbors? | |
with x as( | |
select [dbo].[Users].[Location], | |
sum(case when Posts.PostTypeId = 1 then 1 else 0 end) as questions, | |
sum(case when Posts.PostTypeId = 2 then 1 else 0 end) as answers | |
from [dbo].[Posts] | |
join [dbo].[Users] on [Posts].OwnerUserId = [Users].Id | |
join #locations on [dbo].[Users].[Location] = #locations.Location | |
group by [dbo].[Users].[Location] | |
) | |
select top(10) x.Location | |
from x | |
order by questions/(answers+0.1) desc --maybe add weights here? |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Just FYI, here's my edited version of your query to show more columns for sniff testing: