
January 28, 2020

Top 20 Internet of Things (IoT) Interview Questions & Answers


Ques: 1. Why does the need for the Internet of Things (IoT) arise?

Ans: The real value of the IoT is not in making the lights turn on or off when a vehicle reaches its parking bay, but rather in the data that connected devices gather about their users. The data collected can later allow consumers, businesses, and connected cities to work in a more efficient manner.

Internet of Things (IoT) is a network of interconnected objects that uses embedded technology to sense and communicate with their internal states or the external environments. IoT is all about connecting the objects (devices) over the Internet and enabling them to interact with us and other objects.


Ques: 2. What are the main components of the Internet of Things (IoT) system?

Ans: An IoT system has three main components:

  • Devices embedded with sensors. 
  • Network connectivity 
  • Data storage application/system


Ques: 3. How does IoT differ from the IIoT?

Ans: 
Internet of Things (IoT): In this system, the connection is between devices and gadgets that are used by average consumers in their day-to-day life. Examples could be washing machines, refrigerators, electrical bulbs, etc. These devices are connected to the Internet and work the way they are programmed and prompted to.

Industrial Internet of Things (IIoT): With IIoT, massive industrial items and systems are connected via the Internet to carry out functions without human intervention. The alarm systems in hospitals or the automatic shutting off of switches in industrial plants are examples of IIoT.


Ques: 4. Do you have any relatable examples of the way in which IoT impacts our everyday lives?

Ans: Since IoT’s primary aim is to connect devices and make them function intuitively according to the needs of the situation, it is used in a variety of devices that we use in our daily lives, helping us coordinate their functions better. We no longer have to spend extra time manually handling these devices, which means a lot of time saved.

Since connectivity with other devices is at the heart of the concept of IoT, IoT-enabled devices not only carry out their functions in isolation, but also coordinate with others in the network to make work far more efficient than earlier devices could. For instance, while a standalone alarm clock can only wake you up, in an IoT ecosystem, based on your early-morning habits, IoT could also heat the coffee decoction for you or turn the geyser on.


Ques: 15. What importance does cybersecurity have for IoT?

Ans: Cybersecurity is a matter of serious concern wherever the Internet is involved. So, with its heavy reliance on the Internet, the IoT is sure to make cybersecurity a crucial area of its work.

Corporations and governments will have to be on their toes in anticipating and preventing cyber attacks on the IoT. This calls for a very strongly coordinated, robust response from all the players involved because a disruption in any link in the chain can throw the whole ecosystem out of gear.


Ques: 16. Assess the impact the IoT will have on the healthcare sector.

Ans: The IoT is extremely well-suited for the healthcare sector. With its ability to coordinate large volumes of data seamlessly and across devices, the IoT can be a great facilitator for the healthcare industry. At the micro level, it will help generate a lot of data from health-conscious users, such as pulse, blood pressure, calorie count, etc. At the macro level, it will make tele-medicine far more efficient by coordinating data across devices in far-flung locations.


Ques: 17. What do you understand by the Internet of Everything and what are the elements that go into it?

Ans: The Internet of Everything (IoE) can be understood as an extension of the IoT. While the IoT connects devices with one another, the IoE aims at connecting devices with people and helping them make smart decisions by coordinating with them. These are some of the elements that go into the IoE:

Human beings: The IoE facilitates closer and smoother coordination between humans and systems such as social networks, computers, healthcare gadgets, teaching aids, etc.

Procedure: This aspect of the IoE seeks to automate processes that go into most things managed by humans, ranging from handheld devices to large industrial manufacturing processes.

Things: This is the set of devices or things used by humans that connect to the Net and help share information. Sensors, actuators, and certain types of medical devices or meters are examples of things in the IoE.

Information: Obviously, this is at the center of the IoE, as much as it is in the IoT. Algorithms function smartly to help make sense of the huge loads of data that devices and other things generate.


Ques: 18. What influence will the Internet of Things (IoT) have on economic growth?

Ans: Several economic analyses have anticipated that the IoT will contribute substantially to economic growth over the next decade; however, the predictions vary greatly in magnitude. The current global IoT market has been valued at approximately $2 trillion, with estimates of its projected value over the next five to ten years ranging from $4 trillion to $11 trillion.


Ques: 19. What are the elements of the Internet of Everything?
Ans: 
Human beings: People will continue to connect through devices, like smartphones, computers, and tablets, in addition to social networks such as Facebook and LinkedIn.

Procedure: This includes evolving technology, business, organizational, and other processes that will be needed in order to manage and, to a large extent, automate the explosive growth in connections and the resulting accumulation, analysis, and communication of data that will be inevitable in the Internet of Everything.

Things: This consists of many physical objects like sensors, meters, actuators, and other types of devices that can be attached to any item and that are, or will be, capable of connecting to the network and sharing data. These items will sense and deliver more data, respond to control inputs, and provide more information to help human beings and machines make decisions.

Information: Today, devices typically collect data and stream it over the Internet to a central source, where it is analyzed and processed. Such data is expected to surpass today’s biggest social media data set by another order of magnitude.


Ques: 20. What is Bluetooth Low Energy (BLE) Protocol for an Internet of Things (IoT)?
Ans: Nokia originally introduced this protocol as Wibree in 2006. Also known as Bluetooth Smart, this protocol provides the same range coverage as classic Bluetooth with greatly reduced power consumption. It has bandwidth comparable to Zigbee, with similarly narrow channel spacing. Low power consumption, low latency, and lower complexity make it more suitable for integration into low-cost microcontrollers.


December 23, 2019

Top 20 Data Science Interview Questions and Answers

 

Ques: 1. What is the difference between Data Science and Data Analytics?

Answer: 

Data scientists slice data to extract valuable insights that a data analyst can apply to real-world business scenarios. The main difference between the two is that data scientists have more technical knowledge than business analysts; moreover, they do not need the understanding of the business that is required for data visualization.

 

Ques: 2. How would you collect and analyse social media data to predict weather conditions?

Answer: 

You can collect social media data using the Facebook, Twitter, and Instagram APIs. For example, for Twitter, you can construct features from each tweet, such as the tweet date, number of retweets, list of followers, etc. Then you can use a multivariate time series model to predict the weather condition.

 

Ques: 3. What is Cross-Validation?

Answer: 

It is a model validation technique for evaluating how the outcomes of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the objective is forecasting, and one wants to estimate how accurately a model will perform in practice. The goal of cross-validation is to set aside a data set to test the model during the training phase (i.e., a validation data set), in order to limit problems like overfitting and gain insight into how the model will generalize to an independent data set.
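The mechanics can be sketched in plain Python. The "model" below is a hypothetical mean predictor, used only to show how folds are held out and errors averaged; a real workflow would substitute an actual learner.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    fold_size = n // k
    folds = []
    for i in range(k):
        start = i * fold_size
        end = start + fold_size if i < k - 1 else n  # last fold takes the remainder
        folds.append(list(range(start, end)))
    return folds

def cross_validate(y, k=5):
    """Estimate the held-out error of a mean predictor via k-fold CV."""
    folds = k_fold_indices(len(y), k)
    errors = []
    for test_idx in folds:
        test = set(test_idx)
        train = [y[i] for i in range(len(y)) if i not in test]
        prediction = sum(train) / len(train)            # "fit" on training folds
        mse = sum((y[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        errors.append(mse)                              # error on the held-out fold
    return sum(errors) / len(errors)                    # average across folds
```

Averaging across folds gives a less optimistic error estimate than evaluating on the training data itself, which is exactly the overfitting problem the answer describes.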

 

Ques: 4. What are the Steps in Making a “Decision Tree”?

Answer: 

The steps to make a “Decision Tree” are as follows:

  1. Take the entire data set as input.
  2. Look for a split that maximizes the separation of the classes. A split is any test that divides the data into two sets.
  3. Apply the split to the input data (divide step).
  4. Re-apply steps 1 to 2 to the divided data.
  5. Stop when you meet some stopping criteria. 
  6. Clean up the tree if you went too far doing splits. This step is called pruning.
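Step 2 above, the split search, can be sketched for one-dimensional data. Gini impurity is used here as the separation measure, which is one common choice (the steps themselves do not fix a criterion):

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Return the threshold on xs that most reduces weighted Gini impurity of ys."""
    parent = gini(ys)
    best_threshold, best_gain = None, 0.0
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:          # skip splits that leave one side empty
            continue
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        gain = parent - weighted           # impurity reduction achieved by this split
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain
```

A full tree builder would apply this search recursively (steps 3–4) and stop when the gain falls below a threshold (step 5).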

 

Ques: 5. Can you explain Star Schema?

Answer: 

It is a traditional database schema with a central table. Satellite tables map IDs to physical names or descriptions and can be connected to the central fact table using the ID fields; these tables are known as lookup tables and are principally useful in real-time applications, as they save a lot of memory. Sometimes star schemas involve several layers of summarization to recover information faster.

 

Ques: 6. What are the various steps for a Data analytics project?

Answer: 

The following are important steps involved in an analytics project:

  1. Understand the Business problem. 
  2. Explore the data and study it carefully. 
  3. Prepare the data for modelling by finding missing values and transforming variables. 
  4. Start running the model and analyse the result. 
  5. Validate the model with a new data set. 
  6. Implement the model and track the result to analyze the performance of the model for a specific period.


Ques: 7. Why is data cleansing essential, and which method do you use to maintain clean data? Explain.

Answer: 

Dirty data often leads to incorrect insights, which can damage the prospects of any organization. For example, suppose you want to run a targeted marketing campaign, but your data incorrectly tells you that a specific product will be in demand with your target audience; the campaign will fail.

 

Ques: 8. What is reinforcement learning?

Answer: 

Reinforcement learning is a learning mechanism for mapping situations to actions so as to maximize a numerical reward signal. In this method, the learner is not told which action to take but must instead discover which actions yield the maximum reward, as the method is based on a reward/penalty mechanism.

 

Ques: 9. While working on a data set, how can you select important variables? Explain.

Answer: 

You can use the following methods of variable selection:

  • Remove the correlated variables before selecting important variables.
  • Use linear regression and select variables based on their p-values.
  • Use Backward Selection, Forward Selection, and Stepwise Selection.
  • Use Xgboost or Random Forest and plot the variable importance chart.
  • Measure information gain for the given set of features and select the top n features accordingly.
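The last method, information gain, can be sketched in plain Python for a single categorical feature. This is a toy illustration of the idea, not a library implementation:

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)  # weighted child entropy
    return entropy(labels) - remainder
```

Computing this for each candidate feature and keeping the top n with the highest gain implements the selection strategy in the bullet above.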

 

Ques: 10. What cross-validation technique would you use on a time series dataset?

Answer: 

Instead of using k-fold cross-validation, you should be aware of the fact that a time series is not randomly distributed data; it is inherently ordered chronologically.

In the case of time series data, you should use techniques like forward chaining, where you model on past data and then test on forward-facing data.

fold 1: training[1], test[2]

fold 2: training[1 2], test[3]

fold 3: training[1 2 3], test[4]

fold 4: training[1 2 3 4], test[5]
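The forward-chaining folds above can be generated programmatically; a minimal sketch, treating each number as a block of time-ordered data:

```python
def forward_chaining_splits(n_blocks):
    """Yield (train_blocks, test_block) pairs: train on all blocks before the test block."""
    for test in range(2, n_blocks + 1):
        yield list(range(1, test)), test
```

Each fold only ever trains on blocks that precede the test block, so the model is never evaluated on data from its own past, which is what makes the scheme valid for time series.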

 

Ques: 11. What is deep learning?

Answer: 

Deep learning is a subfield of machine learning inspired by the structure and function of the brain, using what are called artificial neural networks. Machine learning includes many algorithms, such as linear regression, SVMs, neural networks, etc., and deep learning is an extension of neural networks. In ordinary neural nets we use a small number of hidden layers, but deep learning algorithms use a large number of hidden layers to better capture the input-output relationship.

 

Ques: 12. What is the difference between machine learning and deep learning?

Answer: 

Machine learning:

Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed. Machine learning can be categorized into the following three categories:

  • Supervised machine learning,
  • Unsupervised machine learning,
  • Reinforcement learning

Deep learning:

Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks.

 

Ques: 13. What is selection bias?

Answer: 

Selection bias is the bias introduced by the selection of individuals, groups, or data for analysis in such a way that proper randomization is not achieved, with the result that the sample obtained is not representative of the population intended to be analysed. It is sometimes referred to as the selection effect. The phrase “selection bias” most often refers to the distortion of a statistical analysis resulting from the method of collecting samples. If the selection bias is not considered, then some conclusions of the study may not be accurate.

 

Ques: 14. What is TF/IDF vectorization?

Answer: 

TF-IDF, short for term frequency-inverse document frequency, is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in information retrieval and text mining. The TF-IDF value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps adjust for the fact that some words appear more frequently in general.
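A minimal sketch in plain Python. Real libraries apply extra smoothing and normalization; this version uses raw term frequency and a plain logarithmic inverse document frequency:

```python
import math

def tf_idf(term, document, corpus):
    """TF-IDF weight of `term` in `document`, where each document is a list of tokens."""
    tf = document.count(term) / len(document)        # term frequency in this document
    df = sum(1 for doc in corpus if term in doc)     # number of documents containing the term
    idf = math.log(len(corpus) / df)                 # inverse document frequency
    return tf * idf
```

A word that appears in every document (like a stop word) gets an IDF of log(1) = 0, so its weight vanishes regardless of how often it occurs, which is the offsetting effect described above.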

 

Ques: 15. What is the difference between Regression and classification ML techniques?

Answer: 

Both regression and classification machine learning techniques come under supervised machine learning algorithms. In supervised machine learning, we must train the model using a labelled dataset; during training we explicitly provide the correct labels, and the algorithm tries to learn the pattern from input to output. If our labels are discrete values, then it is a classification problem, e.g. A, B, etc.; if our labels are continuous values, then it is a regression problem, e.g. 1.23, 1.333, etc.

 

Ques: 16. What is p-value?

Answer: 

When you perform a hypothesis test in statistics, a p-value helps you determine the strength of your results. The p-value is a number between 0 and 1, and its value indicates the strength of the evidence. The claim on trial is called the null hypothesis.

A low p-value (< 0.05) indicates evidence against the null hypothesis, which means we can reject the null hypothesis. A high p-value (> 0.05) indicates weak evidence against the null hypothesis, which means we fail to reject it. A p-value around 0.05 could go either way. To put it another way,

High P values: your data are likely with a true null.
Low P values: your data are unlikely with a true null.
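As a concrete illustration, a p-value can be computed exactly for a simple case: testing whether a coin is fair, given an observed number of heads. This is a hypothetical example using the binomial distribution, not tied to any particular dataset:

```python
from math import comb

def fair_coin_p_value(heads, n):
    """Two-sided p-value: P(result at least as extreme as observed | fair coin)."""
    observed = abs(heads - n / 2)              # how far the observation is from 50/50
    total = 0.0
    for k in range(n + 1):
        if abs(k - n / 2) >= observed:         # outcomes at least as extreme
            total += comb(n, k) * 0.5 ** n     # binomial probability of k heads
    return total
```

Getting 9 or 10 heads out of 10 flips yields a p-value below 0.05, so we would reject the null hypothesis that the coin is fair; 5 out of 10 yields a p-value of 1, consistent with a true null.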


Ques: 17. What are the differences between overfitting and underfitting?

Answer: 

In order to make reliable predictions on general untrained data in machine learning and statistics, it is required to fit a (machine learning) model to a set of training data. Overfitting and underfitting are two of the most common modelling errors that occur while doing so.

Following are the differences between overfitting and underfitting:

Definition - A statistical model suffering from overfitting describes some random error or noise in place of the underlying relationship. When underfitting occurs, a statistical model or machine learning algorithm fails in capturing the underlying trend of the data.

Occurrence – When a statistical model or machine learning algorithm is excessively complex, it can result in overfitting. An example of a complex model is one with too many parameters relative to the total number of observations. Underfitting occurs when trying to fit a linear model to non-linear data.

Poor Predictive Performance – Although both overfitting and underfitting yield poor predictive performance, the way in which each one of them does so is different. While the overfitted model overreacts to minor fluctuations in the training data, the underfit model under-reacts to even bigger fluctuations.

 

Ques: 18. Could you explain the role of data cleaning in data analysis?

Answer: 

Data cleaning can be a daunting task since with the increase in the number of data sources, the time required for cleaning the data increases at an exponential rate.

This is due to the vast volume of data generated by additional sources. Also, data cleaning alone can take up to 80% of the total time required for carrying out a data analysis task.

Nevertheless, there are several reasons for using data cleaning in data analysis. Two of the most important ones are:

  • Cleaning data from different sources helps in transforming the data into a format that is easy to work with. 
  • Data cleaning increases the accuracy of a machine learning model.

 

Ques: 19. Can you explain Recommender Systems along with an application?

Answer: 

Recommender systems are a subclass of information filtering systems, meant for predicting the preferences or ratings that a user would give to a product.

An application of a recommender system is the product recommendations section in Amazon. This section contains items based on the user’s search history and past orders.
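A toy sketch of the idea, using a hypothetical ratings dictionary and a simple shared-item heuristic rather than any production algorithm:

```python
def recommend(target, ratings):
    """Recommend the unseen item with the highest average rating among
    users who share at least one rated item with `target`.
    `ratings` maps each user to a dict of {item: score}."""
    seen = set(ratings[target])
    scores = {}
    for user, items in ratings.items():
        if user == target or not seen & set(items):   # skip users with no overlap
            continue
        for item, score in items.items():
            if item not in seen:                      # only recommend unseen items
                scores.setdefault(item, []).append(score)
    if not scores:
        return None
    return max(scores, key=lambda item: sum(scores[item]) / len(scores[item]))
```

Production systems replace this overlap heuristic with techniques like collaborative filtering over similarity-weighted neighbours or matrix factorization, but the input/output shape is the same: past ratings in, a predicted-preference ranking out.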

 

Ques: 20. What is exploding gradients?

Answer: 

“Exploding gradients are a problem where large error gradients accumulate and result in very large updates to neural network model weights during training.” At an extreme, the values of weights can become so large as to overflow and result in NaN values.

This has the effect of your model being unstable and unable to learn from your training data.

Gradient: Gradient is the direction and magnitude calculated during training of a neural network that is used to update the network weights in the right direction and by the right amount.
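A standard remedy, not covered in the answer above, is gradient clipping: when the gradient's norm exceeds a threshold, it is rescaled before the weight update, keeping updates bounded. A minimal sketch:

```python
import math

def clip_gradient(grad, max_norm):
    """Rescale `grad` (a list of floats) so its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return grad                       # already within bounds, leave untouched
    scale = max_norm / norm               # shrink factor preserving direction
    return [g * scale for g in grad]
```

The clipped gradient keeps its direction but has its magnitude capped, so even when error gradients accumulate during training, the resulting weight updates stay finite and the model can continue learning.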