
November 17, 2021

Top 20 Apache Ambari interview Questions & Answers

  

Ques: 1). Describe Apache Ambari's main characteristics.

Answer:

Apache Ambari is an Apache project created to simplify the provisioning, management, and monitoring of Hadoop clusters. Its main characteristics are:

  • Simple provisioning
  • Simple project management
  • Monitoring of Hadoop clusters
  • A user-friendly interface
  • A web UI for Hadoop management
  • RESTful API support
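As a sketch of the RESTful API support listed above, a cluster listing can be requested over HTTP. The hostname, port, and credentials below are illustrative placeholders, not values from this article.

```shell
# Build the base URL for a hypothetical Ambari server's REST API.
AMBARI_HOST="ambari-server.example.com"   # placeholder hostname
API_BASE="http://${AMBARI_HOST}:8080/api/v1"

# Print the request we would make to list all managed clusters.
echo "GET ${API_BASE}/clusters"

# The real call (commented out) would authenticate and set the
# X-Requested-By header that Ambari expects:
#   curl -u admin:admin -H "X-Requested-By: ambari" "${API_BASE}/clusters"
```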


Ques: 2). Why do you believe Apache Ambari has a bright future?

Answer:

With the growing adoption of big data technologies like Hadoop, data analysis workloads have surged, resulting in very large clusters. Companies are turning to tools like Apache Ambari for better cluster management, increased operational efficiency, and improved visibility. Furthermore, Hortonworks has continued to invest in Ambari to make it more scalable. As a result, learning Hadoop along with tools like Apache Ambari is advantageous.


Ques: 3). What are the core benefits for Hadoop users by using Apache Ambari?

Answer: 

Apache Ambari is very useful for people who work with Hadoop day to day. With Ambari, Hadoop users get the following core benefits:

1. Simplified installation
2. Simplified configuration and overall management
3. A centralized security setup process
4. Full visibility into cluster health
5. Extensive extensibility and customization options


Ques: 4). What Are The Checks That Should Be Done Before Deploying A Hadoop Instance?

Answer:

Before actually deploying the Hadoop instance, the following checklist should be completed:

  • Check for existing installations
  • Set up passwordless SSH
  • Enable NTP on the clusters
  • Check for DNS
  • Disable the SELinux
  • Disable iptables
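The checklist above can be sketched as shell commands. This is a hedged sketch for an RHEL/CentOS-style host; the hostname is a placeholder, and the commands that touch a real system are shown as comments because they require root privileges and live nodes.

```shell
# Pre-deployment checks, sketched for an RHEL/CentOS-style host.
# All commands that touch a real system are shown as comments.

CHECKS=6   # number of items on the checklist above

# 1. Check for existing installations (leftover packages):
#      rpm -qa | grep -i -e hadoop -e ambari

# 2. Set up password-less SSH from the Ambari host to each node:
#      ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
#      ssh-copy-id root@worker1.example.com

# 3. Enable NTP so cluster clocks stay in sync:
#      systemctl enable --now ntpd

# 4. Check DNS resolution for each node:
#      host worker1.example.com

# 5. Disable SELinux (persist the change in /etc/selinux/config):
#      setenforce 0

# 6. Disable iptables:
#      systemctl stop iptables && systemctl disable iptables

echo "pre-deployment checklist: ${CHECKS} items"
```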


Ques: 5). As a Hadoop user or system administrator, why should you choose Apache Ambari?

Answer:

Apache Ambari provides a Hadoop user or system administrator with a number of advantages. With Ambari, a system administrator can:

Install Hadoop across any number of hosts using a step-by-step wizard, with Ambari handling the installation and configuration.

Centrally administer Hadoop services across the cluster.

Efficiently monitor the state and health of the cluster using the Ambari metrics system. In addition, the Ambari alert framework sends timely notifications of system problems, such as disk space issues or node outages.

 

Ques: 6). Can you explain Apache Ambari architecture?

Answer:

Apache Ambari consists of following major components-

  • Ambari Server
  • Ambari Agent
  • Ambari Web

Apache Ambari Architecture

As shown in the architecture diagram, all metadata is handled by the Ambari server, which includes a Postgres database instance. The Ambari agent is installed on every machine in the cluster, and the Ambari server manages each host through it.

The Ambari agent is a process on each host that delivers heartbeats from the nodes to the Ambari server, along with various operational metrics used to determine the nodes' health.

The Ambari Web UI is a client-side JavaScript application that performs cluster operations by regularly calling the Ambari RESTful API. The RESTful API also enables asynchronous communication between the application and the server.

 

Ques: 7). Apache Ambari supports how many layers of Hadoop components, and what are they?

Answer: 

Apache Ambari supports three tiers of Hadoop components, which are as follows:

1. Hadoop core components

  • Hadoop Distributed File System (HDFS)
  • MapReduce

2. Essential Hadoop components

  • Apache Pig
  • Apache Hive
  • Apache HCatalog
  • WebHCat
  • Apache HBase
  • Apache ZooKeeper

3. Components of Hadoop support

  • Apache Oozie
  • Apache Sqoop
  • Ganglia
  • Nagios

 

Ques: 8). What different sorts of Ambari repositories are there?

Answer: 

Ambari Repositories are divided into four categories, as below:

  1. Ambari: Ambari server, monitoring software packages, and Ambari agent are all stored in this repository.
  2. HDP-UTILS: The Ambari and HDP utility packages are stored in this repository.
  3. HDP: Hadoop Stack packages are stored in this repository.
  4. EPEL (Extra Packages for Enterprise Linux): a repository of additional software packages for Enterprise Linux.

 

Ques: 9). How can I manually set up a local repository?

Answer:

When there is no active internet connection available, this technique is used. Please follow the instructions below to create a local repository:

1. First and foremost, create an Apache httpd host.
2. Download a Tarball copy of each repository's entire contents.
3. After it has been downloaded, the contents must be extracted.
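As a sketch, the three steps above might look like the following on an RHEL/CentOS-style host. The mirror URL and paths are illustrative placeholders, not official download locations, and the commands touching a real system are shown as comments.

```shell
# Directory from which httpd will serve the local repository.
REPO_DIR="/var/www/html/repos"

# 1. Create an Apache httpd host:
#      yum install -y httpd && systemctl enable --now httpd

# 2. Download a tarball copy of each repository's contents:
#      wget -P /tmp http://mirror.example.com/ambari-repo.tar.gz

# 3. Extract the downloaded contents into the web root:
#      mkdir -p "$REPO_DIR" && tar -xzf /tmp/ambari-repo.tar.gz -C "$REPO_DIR"

echo "repository would be served from ${REPO_DIR}"
```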

 

Ques: 10). What is a local repository, and when are you going to utilise one?

Answer:

A local repository is a hosted place for Ambari software packages in the local environment. When the enterprise clusters have no or limited outbound Internet access, this is the method of choice.

 

Ques: 11). What are the benefits of setting up a local repository?

Answer: 

First and foremost, by setting up a local repository you can deploy Ambari software packages without Internet access. Along with that, you gain benefits such as:

  • Enhanced governance and better installation performance
  • Faster routine post-installation cluster operations, such as service start and restart

 

Ques: 12). What are the new additions in Ambari 2.6 versions?

Answer:

Ambari 2.6.2 added the following features:

  • Protection of Zeppelin Notebook SSL credentials
  • The ability to set appropriate HTTP headers to use Cloud Object Stores with HDP

Ambari 2.6.1 added the following feature:

  • Conditional installation of LZO packages through Ambari

Ambari 2.6.0 added the following features:

  • Distributed mode of the Ambari Metrics System (AMS), with multiple Collectors
  • Host recovery improvements for restarts, moving masters with minimal impact, and scale testing
  • Improvements in data archival and purging in Ambari Infra

 

Ques: 13). List Out The Commands That Are Used To Start, Check The Progress And Stop The Ambari Server?

Answer:

The following are the commands that are used to do the following activities:

To start the Ambari server

ambari-server start

To check the Ambari server processes

ps -ef | grep ambari

To stop the Ambari server

ambari-server stop
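The ambari-server script also accepts a status subcommand. The lifecycle commands above can be sketched as follows (printed here rather than executed, since they require a real Ambari installation):

```shell
# Print the lifecycle subcommands supported by the ambari-server script.
for cmd in start status stop; do
  echo "ambari-server ${cmd}"
done

# Checking for the server process (commented out; needs a live host):
#   ps -ef | grep ambari
```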

 

Ques: 14). What tasks can you perform to manage hosts using the Ambari Hosts tab?

Answer: 

Using the Hosts tab, we can perform the following tasks:

  • Analysing Host Status
  • Searching the Hosts Page
  • Performing Host related Actions
  • Managing Host Components
  • Decommissioning a Master node or Slave node
  • Deleting a Component
  • Setting up Maintenance Mode
  • Adding or removing Hosts to a Cluster
  • Establishing Rack Awareness

 

Ques: 15). What tasks can you perform to manage services using the Ambari Services tab?

Answer: 

Using the Services tab, we can perform the following tasks:

  • Start and Stop of All Services
  • Display of Service Operating Summary
  • Adding a Service
  • Configuration Settings change
  • Performing Service Actions
  • Rolling Restarts
  • Background Operations monitoring
  • Service removal
  • Auditing operations
  • Using Quick Links
  • YARN Capacity Scheduler refresh
  • HDFS management
  • Atlas management in a Storm Environment
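Many of these Services-tab actions can also be driven through Ambari's REST API. As a hedged sketch (the cluster name, hostname, and credentials are placeholders), starting or stopping a service is a PUT request that sets the service's desired state:

```shell
CLUSTER="mycluster"     # placeholder cluster name
SERVICE="HDFS"
URL="http://ambari.example.com:8080/api/v1/clusters/${CLUSTER}/services/${SERVICE}"

# Print the request; state STARTED starts the service, INSTALLED stops it.
echo "PUT ${URL}"

# The real call (commented out):
#   curl -u admin:admin -H "X-Requested-By: ambari" -X PUT \
#     -d '{"ServiceInfo": {"state": "STARTED"}}' "${URL}"
```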

 

Ques: 16). Is there a relationship between the amount of free RAM and disc space required and the number of HDP cluster nodes?

Answer: 

Without a doubt, it does. The amount of RAM and disk the Ambari server requires depends on the number of nodes in your cluster. Typically, a small cluster needs about 1 GB of memory and 10 GB of disk space, while a 100-node cluster requires about 4 GB of memory and 100 GB of disk space. Consult the documentation for your specific version for exact figures.

 

Ques: 18). What is the best method for installing the Ambari agent on all 1000 hosts in the HDP cluster?

Answer: 

Because the cluster contains 1000 nodes, we should not install the Ambari agent manually on each node. Instead, we should set up a password-less SSH connection between the Ambari host and all of the cluster's nodes. Ambari Server hosts use SSH public key authentication to access the nodes remotely and install the Ambari Agent.
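The key setup described above can be sketched as a small loop. The hostnames are illustrative, and the commands that touch real hosts are shown as comments.

```shell
# Generate a key pair once on the Ambari server host (commented out):
#   ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Distribute the public key to every node; with 1000 hosts the list
# would come from a file rather than being typed out.
for host in node1.example.com node2.example.com; do
  echo "would run: ssh-copy-id root@${host}"
done
```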

 

Ques: 19). What can I do if I have admin capabilities in Ambari?

Answer: 

Becoming a Hadoop administrator is a demanding job. If you are an Ambari Admin, you can create clusters, manage the users in those clusters, and create groups. All of these permissions are granted to the default admin user, and as an Ambari administrator you can grant the same or different permissions to other users.

 

Ques: 20).  How is recovery achieved in Ambari?

Answer:

Recovery in Ambari happens in the following ways:

Action-based recovery

After a restart, the master checks for pending actions and reschedules them, because every action is persisted. The master also rebuilds the state machines, since the cluster state is persisted in the database. There is a race condition in which the master may crash before recording the completion of an action, which is why actions should be idempotent. The master restarts any action that is not marked as completed, or that is marked as failed, in the database. These persisted actions work like redo logs.

Desired-state-based recovery

The master persists the desired state of the cluster and, after a restart, attempts to bring the cluster back in line with that desired state.



Top 20 Edge Computing Interview Questions & Answers


Ques: 1). What is edge computing, and how does it work?

Answer:

With the passage of time, technology tends to become smaller and faster. As a result, previously "dumb" items such as light bulbs and door locks can now contain modest CPUs and RAM; they can perform calculations and report information on usage. Edge computing enables analytics to be performed at the most granular levels of the network, often known as the edge.

Edge computing puts processing power closer to the end user or the data source. In practice, this means relocating computation and storage from the cloud to a local location, such as an edge server.

 

Ques: 2). Is the footprint of this edge service appropriate for my needs?

Answer:

Different edge computing applications may have drastically different needs for geographic coverage and proximity. Consider the requirements of your project: a manufacturer, for example, might need edge computing nodes within or near each of its factories, but only for a limited number of locations.

The creator of an augmented reality programme that customers can use in stores to get real-time product ratings and pricing comparisons might want edge nodes on every street corner, or as near to that as possible.

 

Ques: 3). Why Edge Computing?

Answer:

Edge computing optimises bandwidth efficiency by analysing data at the edge rather than in the cloud, which would require high-bandwidth data transfer from IoT devices; this makes it practical for use in remote locations at low cost. It enables smart applications and devices to react to data almost instantly, which is critical for business and for self-driving automobiles. It can also process data without placing it on a public cloud, which improves security.

On an extended network, data in transit may become corrupted, compromising its dependability for business use. Performing computation at the edge also limits how much cloud computing is required.

 

Ques: 4).  What are the main Key Benefits and services Of Edge Computing?

Answer:

  • Faster response time.
  • Security and Compliance.
  • Cost-effective Solution.
  • Reliable Operation With Intermittent Connectivity.

Edge Cloud Computing Services:

  • IOT (Internet Of Things)
  • Gaming
  • Health Care
  • Smart City
  • Intelligent Transportation
  • Enterprise Security

 

Ques: 5). Is there really a need for that much computation at the edge?

Answer:

Another way to phrase this question is: Which data-intensive tasks would benefit the most from network offloading? Not all applications will be eligible, and many will require data aggregation that is beyond the capability of local computing. Look for situations where processing data closer to the consumer or data source would be more efficient. These three, according to Steven Carlini, are the best prospects for edge computing.

 


Ques: 7). How much storage should be available at the edge?

Answer:

Large volumes of data that would have been saved in the cloud will now be stored locally thanks to edge computing. While storage technology is inexpensive, management costs are not. Will the cost of keeping and managing device data at the edge justify the move? How will edge devices be protected?

Processing data at the edge, rather than uploading raw data to the cloud, may be a better way to secure user privacy. Edge computing's dispersed nature, on the other hand, renders intelligent edge devices more susceptible to malware outbreaks and security breaches.

 

Ques: 8). Why is it important to concentrate on edge computing right now?

Answer:

Edge is ripe now, thanks to new technology and demand for new applications. Consumers seek reduced latency for content-driven experiences, while businesses need local processing for security and redundancy.

 

Ques: 9). What kind of apps, services, or business strategies would your edge computing platform deliver?

Answer:

Determine which workloads should run on the edge rather than in a central location, if you haven't previously, says Yugal Joshi, vice president of Everest Group. IT leaders should also look into whether any existing initiatives (such as IoT or AI) could benefit from edge processing.

 

Ques: 10). Will the organization's operating model have to alter as a result of edge computing?

Answer:

The usage of edge computing to support operational technologies is common. In such circumstances, technology leaders must determine who will own and manage the edge environment, whether greater alignment between the operating and information technology groups is required, and how performance will be monitored.

 

Ques: 11). What is the distinction between edge computing, cloud computing, and fog computing?

Answer:

Data collection, storage, and calculation are all done on edge devices in edge computing.

Cloud computing is the storing and computation of data on servers that are primarily more powerful and connected to edge devices. The edge devices transfer their data across the network to the cloud, where it is processed by a more sophisticated system.

Fog computing is a hybrid of the two approaches. The cloud servers are sometimes too far away from the edge devices for data analytics to happen quickly enough. In fog computing, an intermediary device is therefore set up as a hub between the edge devices and the cloud. This device performs the computation and analytics required by the edge device.

 

Ques: 12). What role does a database play in edge computing?

Answer:

A device on the edge must be able to store and manage the data it generates efficiently. These devices have very little CPU and storage space, and they may power-cycle frequently and unpredictably. A database system is the only reliable way to store and use data under these conditions. Additionally, the data may need to be easily transferred to a cloud system or accessed from a remote location. A database system paired with a synchronization tool such as SymmetricDS can give a developer a simple set of APIs to accomplish this.

 

Ques: 13). What is the sturdiness of this edge solution? How will the edge provider ensure that the application recovers if it fails?

Answer:

As businesses move beyond experimenting with edge computing to leveraging it for more significant applications, questions like these will become increasingly essential. To mitigate the risks of the innovative edge components, IT architects will want to use tried-and-true technologies whenever possible. Service level agreements and quality of service guarantees are important to business leaders. Even so, there will be setbacks.

 

Ques: 14). What is our long-term plan for managing edge resources?

Answer:

It's difficult enough to manage network and computer resources that are split between company data centres and the cloud. The difficulty could be amplified with edge computing.

You should inquire about what systems management resources an edge service provider provides, as well as how well-known systems management software vendors are addressing the unique aspects of edge computing.

There's also the issue of division of labour: how much control will the enterprise have over how software is deployed to and updated on edge nodes? How much of that will it entrust to a third-party service provider? Will the enterprise even have the option of exercising control over the management of cloud nodes, or will the service provider consider that its own business?

 

Ques: 15). What safeguards do we have in place to avoid being locked into this edge solution?

Answer:

For the most part, open source software and open standards have prevailed in the cloud, and they're likely to win on the edge for the same reasons, according to Drobot. Open internet technologies are the most adaptable and portable, making them popular among clients as well as cloud providers who need to improve their solutions on a regular basis. He predicts that the same dynamics will apply to edge computing. The biggest exceptions so far have been related to edge computing resource metering and billing technology. Technology for managing edge computing that is specific to a particular vendor’s environment could make it harder to move your applications elsewhere.

 

Ques: 16). How might edge computing aid in the real-time visualisation of my business?

Answer:

Because data is handled in parallel across several edge nodes, edge computing allows industrial data to be processed more efficiently. Furthermore, because data is computed at the edge, delay from a round trip across the local network, to the cloud, and back is not required. Edge computing is hence well suited for real-time applications. Edge computing can assist in the prevention of equipment failure by detecting and forecasting when faults may occur, allowing operators to respond earlier. Real-time KPIs can provide decision makers with a complete picture of their system's state. Identifying which information is most valuable to receive in real-time can scope edge computing projects to focus on what’s important.

 

Ques: 17). How can I put Machine Learning to work at the edge?

Answer:

At the edge, machine learning algorithms can reduce raw sensor data by removing duplicates and other noise. Machine Learning can greatly reduce the amount of data that has to be transferred over local networks or kept in the cloud or other database systems by identifying useful information and discarding the rest. Machine learning in edge installations ensures cheaper running costs and more efficient operation of downstream applications.

 

Ques: 18). Where do you see possibilities for integrating with existing systems?

Answer:

According to a survey conducted by IDC Research, 60% of IT workers have five or more analytical databases, and 25% have more than ten. Edge computing allows these external systems to be integrated into a single real-time experience. Edge computing systems can easily consider other systems as new nodes in the system by employing bridges and connections, whereas integration has previously been a big difficulty. As a result, seeing integration opportunities early on can help you get the most out of your edge computing solution.

 

Ques: 19). What kinds of costly incidents may be avoided if I was alerted sooner?

Answer:

Edge computing architectures' real-time advantages can help minimise costly downtime and other unintended consequences. You may more effectively prioritise the desired objectives for your edge computing project by analysing which events can be the most disruptive to your organisation. Edge computing can assist identify the conditions that cause failure in real-time and enable operators to intervene sooner, whether your objective is to reduce downtime, develop an effective predictive maintenance strategy, or ensure that logistical operations are made more efficient.

 

Ques: 20). What can I do to make it more secure?

Answer:

Edge deployments are complicated, as each node adds to the vulnerability surface area. As a result, security planning is vital to the success of any edge computing project. Edge computing enables the encryption of critical data at the point of origin, ensuring an end-to-end security solution. Additional security steps can be taken by separating edge services from the rest of the programme, guaranteeing that even if one node is hacked, the remainder of the application can continue to function normally.



February 10, 2021

Top 20 Oracle BI Publisher Interview Questions and Answers

 

 

Ques. 1): What is the main difference between BI publisher and XML publisher?

Answer: BI Publisher can be installed as a standalone product running on several OC4J-compliant engines, such as Oracle Application Server and Tomcat. Standalone BI Publisher can be pointed anywhere, so reports can be run against an OLTP or warehouse database, MS SQL Server, and even EBS. Within EBS the licence is already included, while the standalone product is licensed separately plus maintenance. XML Publisher, by contrast, operates entirely within EBS and can only be used there.




Ques. 2): What is a data template in BI publisher?

Answer: A data template is an XML structure that contains the queries to be run against a database so that output can be generated in XML format; this XML output is then applied to a layout template to produce the final required output.




Ques. 3): What is the default output format of the report in BI publisher?

Answer: The default output format defined during layout template creation is used to generate the output. It can be changed during request submission, which overrides the format defined on the layout template.




Ques. 4): What are the various sections in the data template?

Answer: The various sections in a BI Publisher data template are:

  • Parameter section
  • Lexical Section
  • Trigger Section
  • SQL statement section
  • Data Structure section



Ques. 5): In BIP, how do you display the company logo in the report output?

Answer: In BIP, you can simply copy and paste the logo (.gif, .jpg, or any supported format) into the header section of the .rtf file, then resize the logo per company standards.




Ques. 6): What is a layout template in BI publisher?

Answer: A layout template defines how the user views the output. It is typically developed as a Microsoft Word document in RTF (rich text format) or as an Adobe PDF. The XML data output (from the data template) is loaded into the layout template at run time, and the final output file is generated.




Ques. 7): How do we create subtotals and Grand Total in BI Publisher?

Answer: If I have a report in OBIEE Answers with three columns –

Brand, Type and Revenue.

A BI Publisher report can be created as follows:

  • Create the BI Publisher Report
  • Login into bi publisher through word and open the BIP report
  • Insert the table wizard and Add the columns Brand, Type and Revenue
  • In the Group by – select Brand
  • click the radio button of Group left
  • click on finish.



Ques. 8): In how many ways you can display images in a BI Publisher Report?

Answer:
The images can be displayed in the following five ways:

  • Direct insertion into the RTF template
  • URL reference
  • OA_MEDIA directory reference
  • Image from a BLOB datatype in the database
  • Using UI Beans



Ques. 9): How to submit a layout in the backend?

Answer: 

We must write a procedure for this using the code below:

FND_REQUEST.ADD_LAYOUT (
  TEMPLATE_APPL_NAME => 'application name',
  TEMPLATE_CODE      => 'your template code',
  TEMPLATE_LANGUAGE  => 'En',
  TEMPLATE_TERRITORY => 'US',
  OUTPUT_FORMAT      => 'PDF'
);



Ques. 10): What are the various XML publisher tables?

Answer:

  • PER_GB_XDO_TEMPLATES
  • XDO_DS_DEFINITIONS_B
  • XDO_DS_DEFINITIONS_TL
  • XDO_DS_DEFINITIONS_VL
  • XDO_LOBS
  • XDO_TEMPLATES_B
  • XDO_TEMPLATES_TL
  • XDO_TEMPLATES_VL
  • XDO_TEMPLATE_FIELDS
  • XDO_TRANS_UNITS
  • XDO_TRANS_UNIT_PROPS
  • XDO_TRANS_UNIT_VALUES



Ques. 11): How to get SYSDATE in the header section dynamically when we run the report?

Answer: You cannot insert form fields in the header section, but you can insert the code directly. For example, insert this in the header section to display the sysdate (format the date however you like):

<?xdofx:sysdate('YYYY-MM-DD')?>



Ques. 12): How do you create a BI Publisher report with two sub reports?

Answer: If I have a report in OBIEE Answers with these columns –

  • Year, Brand, Revenue
  • Region, District, Revenue

A BI Publisher report can be created as follows:

  • Create the BI Publisher Report
  • Login into bi publisher through word and open the BIP report
  • Insert the table wizard
  • Add the columns Year, Brand and Revenue
  • In the Group by select Year and Brand
  • click the radio button of Group left
  • click on finish.
  • delete the ones indicated in circles
  • create a table with 2 rows and 3 columns
  • Give the column headings in the first row
  • copy paste the following indicated by the arrow mark
  • apply the necessary formatting and delete the first table
  • give the necessary aggregation as shown below
  • create the other sub report similarly
  • publish the template
  • view it in bi publisher
  • Thus, we can create sub reports in BI Publisher



 

Ques. 13): How to calculate the running total in XMLP?

Answer:

  • <?xdoxslt:set_variable($_XDOCTX, 'RTotVar', xdoxslt:get_variable($_XDOCTX, 'RTotVar') + ACCTD_AMT)?> (where ACCTD_AMT is the column name)
  • <?xdoxslt:get_variable($_XDOCTX, 'RTotVar')?>

 

Ques. 14): How to use Variables in XML publisher?

Answer: In XML Publisher, variables can be declared and used as follows:

Declare the variable R and assign the value 4 to it:

<?xdoxslt:set_variable($_XDOCTX, 'R', 4)?>

Get the variable's value:

<?xdoxslt:get_variable($_XDOCTX, 'R')?>

This adds 5 to variable R and displays it:

<?xdoxslt:set_variable($_XDOCTX, 'R', xdoxslt:get_variable($_XDOCTX, 'R')+5)?>

This subtracts 2 from variable R and displays it:

<?xdoxslt:set_variable($_XDOCTX, 'R', xdoxslt:get_variable($_XDOCTX, 'R')-2)?>