Most experiments are designed on a controlled corpus, i.e., the ground truth is already known, either manually or through some means other than the experimental tool/automation itself, so precision and recall can be computed. Such corpora are therefore smaller samples of the real corpus, and an oracle can be implemented over them to compute recall. Sampling works in most cases, but it has its own limitations. For example, a sample can pose a serious threat to validity: with a different sample, the results could be different. Creating several large samples under several circumstances could also be infeasible. Let us review some techniques researchers follow in these contexts to compute recall.
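To make this concrete, here is a minimal sketch (not taken from any of the papers discussed below) of computing precision and recall once an oracle supplies the set of relevant items for a sampled corpus; the function name and example values are illustrative:

```python
# Minimal sketch: precision and recall against an oracle, i.e., a
# manually labelled set of relevant items for a sampled corpus.

def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Example: the oracle says items {1, 2, 3, 4} are relevant;
# the tool returned {2, 3, 5}.
print(precision_recall({2, 3, 5}, {1, 2, 3, 4}))  # (0.666..., 0.5)
```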
Gold Sets or Benchmarks
Using an existing benchmark: One way to address this issue is to use a carefully selected, representative dataset such as SF100 (http://www.evosuite.org/experimental-data/sf100/). While such a benchmark is large and unbiased, it may still be too large for certain recall-computation tasks. Such benchmarks are also referred to as “gold sets”. Moreover, these benchmarks are rare and specialized enough that they may not suit your purpose all the time.
Creating your own benchmark: In Shepherd et al.’s paper “Using Natural Language Program Analysis to Locate and Understand Action-Oriented Concerns”, the authors hired a new person to prepare the gold set along with the relevant results. Another person then verified the results, and the two discussed and reconciled any disagreements. This gold set was then released to the community, as sketched below.
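In practice, a released gold set is often just a mapping from each query (or concern) to its known relevant results, and recall is averaged over the queries. The format below is a hypothetical sketch for illustration, not the actual structure of Shepherd et al.’s release:

```python
# Hypothetical gold-set format: query -> set of relevant code elements.
gold_set = {
    "add item to cart": {"Cart.addItem", "CartController.add"},
    "remove item from cart": {"Cart.removeItem"},
}

def average_recall(tool_results, gold_set):
    """Average per-query recall of a tool against the gold set."""
    recalls = []
    for query, relevant in gold_set.items():
        retrieved = set(tool_results.get(query, []))
        recalls.append(len(retrieved & relevant) / len(relevant))
    return sum(recalls) / len(recalls)

# tool_results maps each query to the elements the tool reported.
tool_results = {
    "add item to cart": ["Cart.addItem"],
    "remove item from cart": ["Cart.removeItem", "Cart.clear"],
}
print(average_recall(tool_results, gold_set))  # (0.5 + 1.0) / 2 = 0.75
```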
Comparative Evaluation instead of Recall
In papers such as “Improving Bug Localization using Structured Information Retrieval”, a comparative result is reported instead of recall: the authors claim that their approach localizes x% more bugs than a competing tool.
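A minimal sketch of this style of comparison follows; the helper name, the top-10 cutoff, and the data format are assumptions for illustration, not details taken from the paper:

```python
# Compare two bug-localization tools by how many bugs each places in its
# top-k results, then report the relative improvement.
# results_a / results_b: bug id -> rank of the buggy file (None if missed).

def bugs_localized(results, k=10):
    """Count bugs whose buggy file appears within the top-k ranks."""
    return sum(1 for rank in results.values() if rank is not None and rank <= k)

results_a = {"BUG-1": 1, "BUG-2": 4, "BUG-3": None, "BUG-4": 7}
results_b = {"BUG-1": 3, "BUG-2": None, "BUG-3": None, "BUG-4": 12}

hits_a, hits_b = bugs_localized(results_a), bugs_localized(results_b)
improvement = 100.0 * (hits_a - hits_b) / hits_b
print(f"Tool A localizes {improvement:.0f}% more bugs than Tool B in the top-10")
# Tool A: 3 bugs, Tool B: 1 bug -> 200% more
```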
MRR
If only one relevant result is expected per query, computing the Mean Reciprocal Rank (MRR) is more appropriate than recall. It is also easier to compute over the top-10 or top-k results.
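As a rough sketch: MRR averages the reciprocal rank of the first (and here, only) relevant result over all queries. Treating queries whose relevant result falls outside the top-k as contributing 0 is an assumption of this sketch:

```python
# Mean Reciprocal Rank over top-k results.
# ranks: for each query, the 1-based rank of the single expected result,
# or None if it did not appear in the top-k.

def mean_reciprocal_rank(ranks, k=10):
    """MRR with results beyond rank k counted as misses (contribute 0)."""
    total = 0.0
    for rank in ranks:
        if rank is not None and rank <= k:
            total += 1.0 / rank
    return total / len(ranks)

# Example: expected result found at ranks 1 and 3, and missed once.
print(mean_reciprocal_rank([1, 3, None]))  # (1 + 1/3 + 0) / 3 ≈ 0.444
```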
More on this …soon.