Posts

Deep dive in Excel

Welcome back to my blog. If you can use MS Excel efficiently and draw insights from data, you are a data analyst. Yes, being a data analyst doesn't mean you have to know Python, SQL, or any other fancy stuff. When we talk about Excel, we picture cells (rectangles) where we input our data. After that we make a table and then use things like conditional formatting. But when we go beyond that, we are on the path to analysing the data. So let's start: Step 1 - Open the data and remove duplicates. Go to the Data tab, locate the Remove Duplicates icon, and click it. The duplicates are gone. One thing to keep in mind during this step: never select a single column while removing duplicates, because Excel will then remove duplicate values in that column only, whereas we want to compare all attributes before removing a row. Step 2 - Find null values. The easiest way to do this is to first select a null cell, then go to select similar, and all the null values are selec...
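For readers who like to see the logic spelled out, here is a minimal sketch of the same two steps in plain Python. It is not how Excel does it internally; the sample rows and the helper names `remove_duplicates` and `find_nulls` are made up for illustration, and `None` stands in for a blank cell.

```python
# Hypothetical sample data: each row is (name, city, sales); None marks a blank cell.
rows = [
    ("Asha", "Pune", 120),
    ("Ravi", "Delhi", None),
    ("Asha", "Pune", 120),    # exact duplicate of the first row
    ("Asha", "Mumbai", 120),  # same name, different city: not a duplicate
]


def remove_duplicates(rows):
    """Keep the first occurrence of each row, comparing every column."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def find_nulls(rows):
    """Return (row_index, column_index) for every blank cell."""
    return [
        (r, c)
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
        if value is None
    ]


clean_rows = remove_duplicates(rows)
null_cells = find_nulls(clean_rows)
print(clean_rows)
print(null_cells)
```

Because whole rows are compared, the "Asha, Mumbai" row survives even though the name repeats, which is the reason for checking all attributes rather than one column.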

Formula VS Functions in Excel/ spreadsheet

Welcome to the new blog :-) In this blog I will explain the basic difference between formulas (formulae) and functions in Excel. As an example: formulas are like homemade pizza, where we can add toppings in whatever variety and quantity suits our own taste, and functions are like pizza from Domino's or Pizza Hut, where we just give the name, the size, or extra cheese (all the arguments). It is not fully customized. It is fine to make a single- or double-topping pizza at home (formulas using operators only). But as we all know, most of the time we lack the experience and the ingredients (options) at home to make pizza (write formulas), which makes the pizza a bit different from Domino's and even less tasty (functional). At this point ready-made pizza comes in handy, so we order it (use the built-in functions). Definitions: Formula: a set of instructions used to perform a calculation using the data in a spreadsheet. Example of a formula: = (B3+A2)*5 Function: a preset command...
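The same contrast can be sketched outside a spreadsheet. In the small Python example below, the cell values are invented for illustration: the first calculation mirrors the formula = (B3+A2)*5 written by hand with operators, and the second calls a ready-made function, much like Excel's SUM.

```python
# Hypothetical cell values standing in for a spreadsheet.
cells = {"A2": 4, "B3": 6, "B4": 10}

# Formula style (homemade pizza): we combine the operators ourselves,
# exactly like = (B3+A2)*5 in a cell.
formula_result = (cells["B3"] + cells["A2"]) * 5

# Function style (ordered pizza): we call a ready-made command and only
# pass the arguments, similar in spirit to =SUM(A2, B3, B4).
function_result = sum([cells["A2"], cells["B3"], cells["B4"]])

print(formula_result)
print(function_result)
```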

R for data Analysis

Welcome to the blog where I will tell you how everything works in analysis using R. It is just an overview, but you will come to know the relation between R and data analysis. This blog will also make you fearless about starting with R. So let's start. R is a programming language, we all know this, but you will be glad to know it is a language made specially to analyze data. The question is where to start, right? First, see the benefits of using R: 1. It is an easy language to learn even if you have no coding or technical background. 2. It is like a supermarket where you will find every tool for analysis. 3. It is a one-stop solution where we analyze data, make visualizations, and create documentation. 4. It is open source and free to use. 5. A strong community makes it easy to find solutions if you get stuck somewhere. Ok, ok, on to how to start and how everything works. First you will go online to R studio cloud or y...

Python for Data Analysis

Welcome back to my new blog. Today we are going to touch on the programming language Python. With Python, the real coding part of data analysis begins. We use two Python libraries for analysis, two for scraping, and one for data visualization. We can say it's the whole package for data analysis: it works for big or small data, though we mostly use it for big datasets. Let's see how it works: First we scrape the data, which means extracting data from online sources. This part is not used for internal data, but if you don't have data then you can use Python for data scraping. The libraries we normally use for scraping are Scrapy and BeautifulSoup. We can pull data from websites or web apps by using them. Second, we prepare, process, and analyse the data. As we discussed before, preparation consists of sorting and filtering data, and processing makes the data clutter-free by cleaning the data along w...
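To give a feel for the scraping step described above, here is a rough sketch that pulls table cells out of a small piece of HTML. To keep it self-contained it uses Python's built-in `html.parser` module instead of the scraping libraries named in the post, and the page content and the `CellCollector` class are invented for illustration; a real scraper would first download the page from a website.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this from a website.
PAGE = """
<table>
  <tr><td>Pune</td><td>120</td></tr>
  <tr><td>Delhi</td><td>95</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text found inside every <td> tag."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Only keep text that appears while we are inside a table cell.
        if self.in_cell:
            self.cells.append(data.strip())


collector = CellCollector()
collector.feed(PAGE)
print(collector.cells)
```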

SQL

Welcome again everyone. Today we are going to scratch the surface of how SQL is used in data analysis. SQL stands for Structured Query Language. It is the language we use to talk to a database. Each request we write is called a query. It is like talking to someone in their regional language and asking them to get some work done. We request the database to do some work for us in the database's language (SQL) and take the outcome. Cool, so that was the concept of SQL. Now, how do we use it? It has very little syntax, and extra spaces don't matter in SQL. Let's see a small query: SELECT * FROM database.dataset.table WHERE Attribute = "varchar" In this query we use * to select all the columns. In FROM we choose the table, and WHERE gives a condition that filters which rows we see. There are a lot of different keywords and functions in SQL; the following are a few common ones: - SELECT - WHERE - ORDER BY - CAST - DISTINCT - COUNT - LIMIT There is also the concept of subqueries, whi...
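If you want to try a query like the one above without setting up a database server, one option is Python's built-in `sqlite3` module. The sketch below is only an illustration: the `customers` table, its columns, and its rows are made up, and SQLite is used here as a stand-in for whatever database you actually work with.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, created only for this example.
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Asha", "Pune"), ("Ravi", "Delhi"), ("Meera", "Pune")],
)

# SELECT * picks every column, FROM names the table,
# WHERE filters the rows, and ORDER BY sorts the result.
pune_rows = conn.execute(
    "SELECT * FROM customers WHERE city = ? ORDER BY name",
    ("Pune",),
).fetchall()

# COUNT and DISTINCT together: how many different cities are there?
distinct_cities = conn.execute(
    "SELECT COUNT(DISTINCT city) FROM customers"
).fetchone()[0]

conn.close()
print(pune_rows)
print(distinct_cities)
```

The `?` placeholder passes the city value separately from the query text, which is safer than pasting values into the SQL string.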

Excel and Spreadsheet

Welcome again. In today's blog we will learn about one of the most important and most extensively used tools for data analysis: the spreadsheet, or Excel. We will also discuss a few data terms that are important to know when starting a career in data science. Excel or Spreadsheet: This is a tool we are all familiar with; we learn it in computer basics and know how its formulae and functions work. The following are a few facts about Excel or Spreadsheet. - It is the most popular software for data analysis. - It can handle small data sets very easily. - It is capable of data visualization too. - Excel is a tool from Microsoft, and Spreadsheets (Google Sheets) is from Google. What are the uses of Excel or Spreadsheet for data analysis? We can easily type and feed data into Excel. There are columns and rows in Excel to fill and save the data in a format. Where a column and a row cross there is a cell; each cell has its own address, and we fill our data into those cells to create tables in Excel. Followin...

Road Map For Being a Data Analyst

Hello guys, this is the beginning of data analysis. Follow the blog to learn more about data analysis and data. There are plenty of tools that we use for collecting, preparing, processing, analyzing, and presenting data, such as Python, R, MS Excel, Google Spreadsheets, MS SQL, SQL, BigQuery, PostgreSQL, Power BI, Tableau, etc. So I am going to categorize these tools according to their uses at different stages of data analysis: 1 - For COLLECTING DATA: Python, MS Excel, Spreadsheets. 2 - For PREPARING DATA: Excel, SQL, Spreadsheets, MS SQL, PostgreSQL. 3 - For PROCESSING DATA: Excel, SQL, Spreadsheets, MS SQL, PostgreSQL. 4 - For ANALYZING DATA: Excel, SQL, Spreadsheets, MS SQL, PostgreSQL, Python, R. 5 - For PRESENTING DATA: Power BI, Tableau. We use these tools in different sets. For example: Step 1 - We use Python to scrape data online, or mostly we get data from company and other outer ...