Experiment Design Clause Samples

The Experiment Design clause outlines the framework and requirements for planning and conducting experiments within an agreement. It typically specifies the objectives, methodologies, roles, and responsibilities of each party involved in the experimental process, and may include details such as timelines, data collection methods, and criteria for success. By clearly defining these elements, the clause ensures that both parties have a mutual understanding of how the experiment will be carried out, reducing the risk of misunderstandings and disputes over expectations or outcomes.
Experiment Design. Gnome Trader involves several novel game mechanics which can benefit from a high-performance underlying network. Played by users during their daily commutes, Gnome Trader requires the network to adapt to variations in the positions and densities of players. Moreover, these moving players request access to large content such as 3D models and expect uninterrupted gameplay. As a consequence, the requested content should be available close to the players’ locations, and, in some cases, even follow the players as they move in the city. Overall, there are many ways in which the Gnome Trader game mechanics depend on the performance of the network. The experiments presented in this section have the dual goal of assessing the performance of the FLAME platform and ensuring the game requirements are met to provide the best player experience.
Experiment Design. The goal is to explore how to leverage FLAME media services to enable a user to experience a branching, interactive narrative within a branching city environment as depicted in Figure 7. This section describes several of the system capabilities influencing the design of the validation experiment.
Experiment Design. The main goal of the PMM experiment is to test an adaptive follow-me media streaming service across multiple devices and locations in the Smart City. This goal translates into investigating how the FLAME platform can impact personalized media distribution, with a specific interest in four main aspects. A detailed description of the experiment rationale, narrative storylines, stakeholders and requirements has been provided in Deliverable D3.1: FMI Vision, Use Cases and Scenarios (v1.1) [1]. Starting from that standpoint, the PMM experiment has been designed to cover incrementally up to three scenarios to be deployed in the City of Barcelona with increasing system complexity and number of involved end-users:
Scenario 1: PMM distribution in walking areas in Barcelona, i.e. my screen & preferences follow me from home to my smart hand-held devices to continue media consumption while walking in the Smart City.
Scenario 2: PMM in aggregation areas of the Smart City, i.e. my media follow me also in aggregation areas (e.g. shop, cafeteria, and mall), and surrogate functions for media distribution are allocated in edge nodes for more users.
Scenario 3: PMM in digital signage posts, i.e. access to media contents from large public events in the Smart City at digital signage posts and swipe them onto the personal device.
The planned experiment size and key demonstration points for the three scenarios are as follows.
Scenario 1 (Very Small, 1-5 users): “My screen follows me” from home to smart hand-held devices in the Smart City. The user swipes media from a fixed video/audio device at home to personal mobile devices (e.g. tablets, smartphones) and moves within the FLAME urban area; the ▇▇▇ application is capable of invoking the FLAME platform APIs to instantiate content caches and media service chains to continue streaming on the move; [Complimentary] my preferences follow me: I can resume my playlist (music or video) from where I paused while I’m on the move.
Scenario 2 (Small, 10-50 users): “My screen follows me” from home to public aggregation areas in the Smart City. The user swipes media from a fixed video/audio device at home to personal mobile devices (e.g. tablets, smartphones) and moves within the FLAME urban area; the user uses public transportation or is in an aggregation area (e.g. shop, cafeteria, mall) and FLAME content caches and service chains are re-allocated to serve him/her Live in...
Experiment Design. At the specified time and date of the experiment, participants were invited to remotely log into a project PC that was connected to the university network using TeamViewer (a software application for remote control and desktop sharing). For this purpose, a Dell Alienware Aurora R8 was set up as follows: Intel® Core™ i7-6700K CPU @ 4 GHz, 64 GB RAM, 2 x GeForce RTX 2080 Super (base clock: 1650 MHz, 8 GB of GDDR6 memory, and 3,072 CUDA cores), running Windows 10 Pro (1904), Visual Studio 2017 version 15.9.17, and Unity version 2018.4.12f1 (LTS). Average connection speeds of Up = 901.41 Mbps (SD = 41.08), Down = 521.46 Mbps (SD = 11.35), and Ping = 1.6 ms (SD = 0.49) were measured at the PC before each session. The study followed a two-step scenario testing strategy: the first scenario involved creating a simulated crowd scene, and the second involved retargeting the crowd from the first scene to a semantically similar one. Two crowd simulation scenes were therefore created, each serving as a problem discovery measure for the new crowd simulation tools. This also allowed a comparison of re-targeting time between the two scenes. Users were provided with descriptions of what should be achieved in each scene, along with a brief video tutorial on how to use our tools in the Unity Editor. The tasks in the steps document were a decomposition of how to create the simulated crowd scene. Participants were allowed to ask questions about how to complete these tasks. Specific information on the result they should aim to achieve was also provided. The opportunity to discuss and analyze these procedures with a project researcher post-task ensured that the participants fully understood how the tools worked and could provide an informed evaluation. Participants were allowed a 30-minute break between the creation of each scene.
Figure 1: Scenario 1 - Trinity College “Metropolis” Model
The first scenario uses the “Metropolis” 3D model [Figure 1]. The participants were asked to construct the scene with the following specifications:
1. Create 6 static groups of agents in the main square of the 3D model, containing 5, 4, 6, 3, 8, and 7 crowd members respectively. Each group should play an idle/chat/phone call/texting animation but not move locations throughout the simulation.
2. Create two crowd members with a walk animation that enter through the main front arch and navigate to a window in front of the “Rubric” game object. They should then trigger a “wash windows” animation.
3. ...
Experiment Design. The experiment began with the assignment of Treatment and Control groups. An individual would be assigned to a treatment group if he received job-related training or self-improvement in skills & knowledge within three months before the survey. The remaining individuals would be automatically assigned to control groups. Figure 3.1 presents the details of the eight treatment-control pairs.
Figure 3.1: Observations in Control Group and Treatment Group (Propensity Score Matching Experiments)
The propensity score is then estimated by the “pscore" algorithm in Stata. The logit approach will be used to estimate the propensity score. According to ▇▇▇▇▇▇▇▇ & ▇▇▇▇▇▇▇▇ (2008), the logit approach should be more robust than the multinomial probit approach and could also reduce misspecification. In order to avoid over-parameterisation (▇▇▇▇▇▇ et al. 2002), this paper followed ▇▇▇▇▇▇▇▇ & ▇▇▇▇▇▇▇’▇ (2001) approach of including at least three variables selected from the following three categories: Personal Information, Family Information, and Education Information. Moreover, according to ▇▇▇▇▇▇ & ▇▇▇▇▇▇ (2002), the following two conditions must be met in order to ensure unbiased propensity score estimation: (1) the propensity score must be balanced in each block, and (2) the interval of estimated scores should be sufficient for all propensities. Therefore, this paper conducted propensity score estimation on each of the eight Treatment-Control Pairs to generate balanced scores. The details of the propensity score estimations and subsequent results are listed in Table 3.8:
Sub-experiment 1 (target year 2010, White, Job-training, Pair1): pre-treatment characteristics Age, Higher education indicator, Family macro; propensity score range [0.321, 0.544]; Pseudo-R2 = 0.0156; 26 blocks.
Sub-experiment 2 (target year 2010, Non-white, Job-training, Pair2): pre-treatment characteristics General information macro, Family macro, Higher education indicator; propensity score range [0.0467, 0.575]; Pseudo-R2 = 0.0695; 5 blocks.
Sub-experiment 3 (target year 2010, White, Self-improve, Pair3): pre-treatment characteristics Maternity indicator, Marital status, Higher education indicator, Family macro, Head of household indicator; propensity score range [0.1923, 0.5102]; Pseudo-R2 = 0.0374; 5 blocks.
Sub-experiment 4 (target year 2010, Non-white, Self-improve, Pair4): pre-treatment characteristics Age, Gender, Marital status, Higher education indicator, Maternity leave indicator, Household children indicator, Job-training; propensity score range [0.2016, 0.6006]; Pseudo-R2 = 0.0426; 9 blocks.
Sub-experiment 5 (target year 2016, White, Job-training, Pair5): pre-treatment characteristics Marital status, Maternity leave indicator, Higher education indicator, Household children indicator, Skills & Knowledge Improvement; propensity score range [0.087, 0.6...
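As a rough illustration only (the clause itself uses Stata's pscore routine), the logit-based propensity score step described above could be sketched in Python as follows; the data frame and column names are hypothetical placeholders, not taken from the source:

```python
# Minimal sketch of logit-based propensity score estimation, analogous in
# spirit to the Stata "pscore" step described above. Column names are
# hypothetical placeholders for variables drawn from the Personal, Family
# and Education categories.
import pandas as pd
import statsmodels.api as sm

def estimate_propensity(df: pd.DataFrame, treatment: str, covariates: list) -> pd.Series:
    """Fit a logit model of treatment status on pre-treatment covariates
    and return the estimated propensity score for each observation."""
    X = sm.add_constant(df[covariates])      # at least three pre-treatment variables
    y = df[treatment]                        # 1 = treated (e.g. job-training), 0 = control
    logit = sm.Logit(y, X).fit(disp=False)   # logit rather than probit, per the clause
    return pd.Series(logit.predict(X), index=df.index)  # scores in (0, 1)

# Hypothetical usage for one treatment-control pair:
# df["pscore"] = estimate_propensity(df, "job_training",
#                                    ["age", "higher_education", "family_macro"])
```

The block balancing check (condition (1) above) would still need to be carried out separately, for example by stratifying on the estimated score and testing covariate means within each block.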
Experiment Design. Collecting a large enough sample of perceptual ratings to be able to evaluate both inter-rater agreement and human-automated reliability is challenging. For example, for listeners to rate 5 recordings requires making judgments of sets of individual features for 5 recordings (1, 2, 3, 4, 5), similarity for 10 different pairs (1 vs. 2, 1 vs. 3, 2 vs. 3, etc.) or 10 different triplets (1 vs. 2 vs. 3, 1 vs. 2 vs. 4, etc.), which takes approximately 30 minutes. However, human judgments for only 5 recordings would not be enough to meaningfully compare with automated algorithms. On the other hand, increasing the sample to 10 recordings would require rating 10 sets of features, 45 pairs, and 120 triplets, which is already more than can be collected within the course of a 1-hour experiment, especially when accounting for listener fatigue. If we attempt to spread out the data collection across multiple participants by having different participants rate different recordings, we lose the ability to compare inter-rater agreement between participants. Unfamiliarity, use/absence of reference tracks, and order effects can also affect perception of similarity. To balance the need for enough data to compare both human-human and human-automated agreement, we designed an experiment where we divided the set of 30 diverse recordings previously used to evaluate inter-rater agreement into 6 sets of 5 recordings. For each set, we collected perceptual judgments of all possible features, pairs, and triplets from 10-11 participants per set (total n = 62 participants). The 62 participants were divided into 6 groups, where all members within each group rated the same 5 songs from the 30-song dataset. Each experiment lasted approximately 20-30 minutes and was divided into three blocks: feature evaluation, pairwise evaluation and triplet (odd-one-out) evaluation. Before beginning the experiment, participants were played a series of reference tracks taken from the Cantometrics training tapes in order to familiarize them with the features they would be rating and the types of recordings they would be asked to rate. Participants then evaluated a set of features for each song after listening to each song at least once, after which they performed the triplet and pairwise similarity tasks. The order of the triplet and pairwise blocks, and the order of songs/combinations within each block, was randomized so as to negate order effects, but the feature evaluation...
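The rating counts quoted above follow directly from binomial coefficients: n recordings yield n feature sets, C(n, 2) pairwise comparisons and C(n, 3) triplet comparisons. A short sketch verifying them:

```python
# Verifies the rating counts quoted above: n recordings require n feature
# evaluations, C(n, 2) pairwise comparisons and C(n, 3) triplet
# (odd-one-out) comparisons.
from math import comb

for n in (5, 10):
    print(f"{n} recordings -> {n} feature sets, "
          f"{comb(n, 2)} pairs, {comb(n, 3)} triplets")

# Output:
# 5 recordings -> 5 feature sets, 10 pairs, 10 triplets
# 10 recordings -> 10 feature sets, 45 pairs, 120 triplets
```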
Experiment Design. We design several experiments to evaluate the accuracy of the robot prototype shown in Figure 25. An ArUco marker [13], [14] is placed on the robot tip, and a camera is used to track the marker's position. The measured position is used as the feedback in the control loop shown in Figure 28. The initial configuration of the robot is set to 𝑙ᵢ(0) = 0.1 m for 𝑖 = 1, 2, 3, 4, and 𝑞 = [0 0 0] m throughout all the experiments. The proportional gain of the controller, 𝐾, is set equal to 0.45 N·m, as this value was found to achieve the minimum tracking error.
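The clause describes a camera-in-the-loop proportional controller; a minimal sketch of that structure is given below under stated assumptions. Only the gain value (0.45) comes from the text; the marker-tracking and actuation functions are hypothetical stand-ins, not the prototype's actual API.

```python
# Minimal sketch of the camera-in-the-loop proportional controller described
# above. Only the gain value comes from the text; the vision and actuation
# calls in the usage example are hypothetical stand-ins.
import numpy as np

K = 0.45  # proportional gain reported to minimise tracking error

def control_step(target_pos: np.ndarray, measured_pos: np.ndarray) -> np.ndarray:
    """One loop iteration: the error between the desired tip position and the
    position measured from the ArUco marker, scaled by the proportional gain."""
    error = target_pos - measured_pos
    return K * error  # command sent to the robot actuators

# Hypothetical usage inside the experiment loop:
# while not done:
#     measured = track_aruco_marker(camera_frame)          # stand-in for the vision step
#     command = control_step(desired_tip_position, measured)
#     apply_command(command)                                # stand-in for actuation
```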
Experiment Design. The study was set up as a 1 by 2 within-participants design. The independent variable was the training, which was done either by the participants themselves or by somebody else (i.e., another participant). Participants took part in one session, divided into two parts: in the first part they trained the robot to do a cooperative task for handling test samples, and in the second part they performed the task with the robot that they had trained, or with the same robot trained by another participant (training from participant 1). The order for the second part was randomised, with six participants starting with their own training and the other five with the training of another participant.
Experiment Design. The videos were assessed through a video HRI study. The study used a 1 by 3 between-participants design with random assignment of participants. The independent variable is the sound condition, which has three levels: no sound, a simple beeping sound, or the use of musical utterances. The “no sound” condition serves as a baseline to see to what extent the situational context alone signals the robot's intent. We also included a condition where the robot would use a simple beeping sound to assess whether the complexity of music improves emotion elicitation or intent communication.
Experiment Design. Definition of the goals. Experimental setup.