Pablo Lamelas –
At TCT 2019, two trials testing different TAVR designs were presented: SCOPE I (ACURATE neo vs Sapien 3) and the Portico IDE trial (Portico vs other available devices). Although they add information to this under-explored field, both studies have substantial methodological issues that make their interpretation very challenging.
SCOPE I summary: ACURATE neo vs Sapien 3
A total of 739 patients with symptomatic severe aortic stenosis were randomized to receive ACURATE neo or Sapien 3. The primary endpoint was a combination of safety and efficacy based on VARC-2, including mortality, stroke, life-threatening bleeding, major vascular complications, coronary obstruction needing intervention, acute kidney injury, re-hospitalization, repeated valvular procedure or valve-related dysfunction (moderate or severe paravalvular leak and/or stenosis). It was a non-inferiority design with a margin of 7.7% (assuming an event rate of 22%), analyzed as intention-to-treat.
ACURATE neo was associated with longer procedure time and more contrast use. The 30-day combined primary endpoint occurred in 23% with ACURATE neo and 16.5% with Sapien 3, not meeting the criteria for non-inferiority (non-inferiority p-value = 0.42, inferiority p-value = 0.016), mainly driven by paravalvular leak and acute kidney injury. No differences were observed in permanent pacemaker implantation, but ACURATE neo was inferior for paravalvular leak and superior for effective orifice area and valve gradients. The authors conclude that non-inferiority was not met for the primary composite outcome, and that ACURATE neo was inferior to Sapien 3 for paravalvular leak and acute kidney injury but superior for gradients and effective orifice area, while other patient-important outcomes like death or stroke were not different.
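The non-inferiority logic in this summary can be sketched numerically. This is a rough illustration, not the trial's actual analysis: it uses the reported 30-day rates (23% vs 16.5%) and the 7.7% margin, but the arm sizes (370 and 369) are an assumed even split of the 739 patients, and the two-sided 95% Wald interval is my choice of method rather than the prespecified one. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_bounds(p_new, n_new, p_ref, n_ref, level=0.95):
    """Two-sided Wald confidence bounds for the risk difference (new minus reference)."""
    diff = p_new - p_ref
    se = sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff - z * se, diff + z * se


# Event rates are the reported ones; arm sizes are ASSUMED (even split of 739).
lower, upper = risk_difference_bounds(0.23, 370, 0.165, 369)
margin = 0.077  # non-inferiority margin stated in the summary

non_inferior = upper < margin  # the whole interval must sit below the margin
inferior = lower > 0           # the whole interval sits above zero

print(f"risk difference, 95% CI: {lower:.3f} to {upper:.3f}")
print(f"non-inferior: {non_inferior}, inferior: {inferior}")
```

Under these assumptions the upper bound lands well above the 7.7% margin (non-inferiority not shown) and the lower bound sits just above zero (consistent with inferiority), which matches the direction of the reported p-values.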
Intention-to-treat in non-inferiority trials
I think most methodologists will agree that intention-to-treat (whether as a principle or as an analysis) is not the preferred way to analyze non-inferiority trials, at least as the primary analysis. Intention-to-treat does not take into account crossovers in active-comparator trials like this one and, if they happen frequently, they can bias towards the null. Crossover bias is an issue in any trial, but especially in non-inferiority ones, in which finding no difference is considered a success. Does this change the interpretation of this trial? Well, to be fair, most of the patients received the assigned treatment, so it is unlikely to change the results dramatically, but it is still an important point to consider when reading it.
Primary composite outcome
The primary outcome was a long list of nine heterogeneous outcomes, and it is worth mentioning all the aspects that have been mixed in it:
- 1) It combined patient-important outcomes (like death or stroke) with surrogate outcomes (like residual paravalvular leak or valvular gradients)
- 2) Among patient-important outcomes, the components were out of proportion from physician and patient perspectives or preferences: death or stroke (more important ones) were mixed with rehospitalization or major bleeding (less important ones)
- 3) The frequency of patient-important outcomes ranged from an expected rate of 1 to 2% for death up to 10 to 15% for rehospitalizations
- 4) Among surrogate outcomes, the equivalency in terms of impact on other patient-important outcomes is debatable (moderate paravalvular leak vs acute kidney injury vs mean gradient over 20 mmHg vs two-valve procedure vs coronary compromise, etc.)
- 5) The components of valve dysfunction (moderate to severe paravalvular leak and mean gradient over 20 mmHg) go in different directions: paravalvular leak favours balloon-expandable valves, and gradient favours self-expandable valves with supra-annular hemodynamics
- 6) Efficacy (mean valvular gradients) was combined with safety (death, vascular complications, etc.)
- 7) Last but not least, this composite outcome is composed of nine elements, increasing the likelihood of finding differences by chance.
All seven limitations listed above make this primary composite outcome challenging to interpret. Even before knowing the result, it is pretty clear that whatever result you get, you just cannot interpret it in a meaningful way. And think about the non-inferiority margin: what is a clinically important difference? A 5% difference driven by death, stroke and rehospitalization is not the same as a 5% difference driven by moderate paravalvular leak or acute kidney injury.
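Point 7 can be read as a multiplicity argument. Strictly, it applies when the nine components are examined one by one rather than to the single test of the composite, so take the following as a simplified sketch: it assumes the components are tested independently at a 5% significance level (both assumptions are mine), and the function name is hypothetical.

```python
def chance_of_any_false_positive(n_components, alpha=0.05):
    """Probability that at least one of n independent tests is significant by chance alone."""
    return 1 - (1 - alpha) ** n_components


# Compare a single outcome, a short composite, and a nine-element composite.
for k in (1, 3, 9):
    print(f"{k} component(s): {chance_of_any_false_positive(k):.2f}")
```

With nine independent looks, the chance of at least one spurious difference is well above the nominal 5%, which is the intuition behind being cautious with component-level findings.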
My take-away from SCOPE I
ACURATE neo was inferior to Sapien 3 in terms of paravalvular leak. This aspect needs to be improved through patient selection: 90% had mild or less paravalvular leak, so long-term outcomes are unlikely to be affected by paravalvular leak in 90% of the eligible patients. New technology has also been implemented: ACURATE neo 2, with a sealing skirt, is already available in some markets. However, both aspects (better patient selection and new technology) need data before jumping into new conclusions, and ongoing trials are addressing this at the moment.
The increased acute kidney injury rate was also aligned with a greater need for pre- and post-dilatation and more contrast use, making this finding likely to be real. Despite being a surrogate outcome, efforts should be made to reduce kidney injury, which has been consistently associated with worse outcomes in many cardiovascular procedural scenarios. For instance, performing a better and more efficient pre-dilatation is likely to reduce post-dilatations, and doing less angiography (relying more on pigtail position during positioning, using reduced contrast doses, and reducing aortograms for the final result) may have a role in reducing this impact on kidney dysfunction.
On the other hand, the increase in procedure time and contrast may be part of the learning curve, which had plateaued more for Sapien 3 (and its older generation, Sapien XT) than for a completely new prosthesis and delivery system/technique. Also note that, although pre-dilatation and post-dilatation were infrequent with Sapien 3, implanting Sapien 3 already requires longer and very effective rapid pacing.
It is important to notice the similar rate of permanent pacemaker implantation between both devices. This may be explained by the lower radial force of the ACURATE neo inflow vs other self-expandable valves on the market. Another possible explanation is that ACURATE neo does cause a higher rate of pacemakers than Sapien 3, and this was not detected because the difference is small and the statistical power of this trial is limited. If the latter is true, then we can be quite certain that we are not talking about a large difference. More data are needed to really understand whether there is a difference in this outcome.
Last but not least, valve gradients and effective orifice area were markedly superior with ACURATE neo. Is this 4 mmHg difference in median gradient favoring ACURATE neo a clinically important difference? From the surgical valve replacement literature, we know that baseline effective orifice area (alone or indexed to body surface area, which defines patient-prosthesis mismatch) correlates with clinical outcomes (survival, rehospitalizations, atrial fibrillation, etc.) and valve durability. Reporting medians/means may make us feel that this is not a large difference, but in the EVOLUT Low Risk trial there was a “modest” 30-day 2 mmHg mean gradient difference favouring TAVR that translated into a roughly 80% reduction (1.8% vs 8.2%) in severe patient-prosthesis mismatch at one year. On the other hand, Sapien 3 was inferior to surgical replacement by 1 mmHg in PARTNER 3. Although it makes sense (and is supported by prior long-term surgical replacement literature) that a better effective orifice area will translate into better durability, this aspect is still unknown in the TAVR literature. Given this outstanding hemodynamic performance, this valve looks like a great option for a small annulus or an annulus that is small relative to body size.
Portico IDE trial summary
A total of 750 patients with severe aortic stenosis at high risk for cardiac surgery were randomized to receive Portico or other commercially available TAVRs (Sapien XT, Sapien 3 [66% were balloon-expandable], CoreValve, Evolut R or Evolut PRO). It was a non-inferiority design with a pre-specified margin of 8.5% assuming an event rate of 30%.
The study claimed non-inferiority (p = 0.03) based on the primary safety outcome (mortality, disabling stroke, major vascular complication, life-threatening bleeding, renal failure with dialysis) analyzed as intention-to-treat, and results remained similar in as-treated and per-protocol analyses. Portico was inferior for vascular complications (which improved in a later non-randomized cohort using the FlexNav system), moderate paravalvular leak and pacemaker insertion. Hemodynamics (gradients and effective orifice areas) were similar to those of other available supra-annular prostheses and superior to Sapien 3. The authors conclude that non-inferiority was met for safety and effectiveness, that major vascular complications showed evidence of a learning curve, and that hemodynamics were better.
This is an Investigational Device Exemption (IDE) trial, which is not supposed to be an explanatory trial seeking the best point estimate of safety and efficacy relative to a specific established gold standard. Accordingly, the comparator group (commercially available valves) was on the pragmatic side of methodological designs: Portico was compared to what is available on the market. Although this gives us an idea of how this new valve performs, the heterogeneity in the comparator makes data interpretation challenging.
This is an example: investigators claim better effective orifice area and gradients than “commercially available valves” when this finding is mainly driven by balloon-expandable prostheses (66%) and does not suggest superiority vs other self-expandable prostheses. In other words, if this study had included 90% self-expandable valves as comparator, the authors would be unlikely to claim superiority in valve hemodynamics. Self- and balloon-expandable valves have other known differences, such as paravalvular leak, pacemaker rate, coronary compromise, valve malpositioning, annular rupture, etc.
As a consequence, having a heterogeneous comparator gives us an idea of how this valve performs in “real practice” (the pragmatic aspect of this trial), where there were two dominant prosthesis families over that long time span (Edwards and Medtronic), but it cannot tell us with precision about the true performance vs a constant, established comparator (explanatory comparator methodology).
Unexpectedly wide non-inferiority margin
The authors established an absolute risk difference of 8.5% as the non-inferiority margin, expecting a 30% event rate in the control arm. Therefore, assuming the study did find the expected 30% risk in the comparator group, if the upper 95% CI corresponded to a Portico event rate above 38.5%, Portico would not meet the criteria for non-inferiority (a “negative” trial result in this case). This prespecified 8.5% absolute increase in events corresponds to a 28% relative increase (relative risk of 1.28).
But this is what happened in the end: the actual control group event rate was 9.6% (less than a third of the expected 30% risk). With the non-inferiority margin fixed at an 8.5% absolute risk difference, the margin now corresponds to almost double the risk (relative risk of 1.9). In other words, Portico could be close to twice as bad as the comparator (at the upper 95% CI) and still be considered non-inferior. This happened because the non-inferiority margin was fixed as an absolute risk difference, and having a third of the expected events made that absolute margin represent a much higher proportion of the observed events at the end of the study.
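The arithmetic behind this point can be checked in a few lines. This is only a sketch using the figures quoted above (8.5% absolute margin, 30% expected and 9.6% observed control-arm event rate); the function name is hypothetical.

```python
def margin_as_relative_risk(control_rate, absolute_margin):
    """Relative risk implied by a fixed absolute margin at a given control-arm event rate."""
    return (control_rate + absolute_margin) / control_rate


absolute_margin = 0.085  # prespecified absolute non-inferiority margin

expected = margin_as_relative_risk(0.30, absolute_margin)   # planned control rate
observed = margin_as_relative_risk(0.096, absolute_margin)  # actual control rate

print(f"relative risk tolerated at the planned 30% rate:  {expected:.2f}")
print(f"relative risk tolerated at the observed 9.6% rate: {observed:.2f}")
```

The same absolute margin tolerates a relative risk of about 1.28 at the planned event rate but close to 1.9 at the observed one, which is why a fixed absolute margin becomes more permissive when events are rarer than expected.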
What is the solution? Now, post-trial, my approach is to interpret the upper 95% CI and see whether that relative increase in events would be acceptable (judgment needed). The upper 95% CI for the risk difference with Portico was 8.1%, which is indeed close to double the risk compared with the 9.6% observed with commercially available TAVRs (relative risk roughly 1.8). To my eye, this is not acceptable for a primary outcome that included patient-important outcomes only. Does that mean that Portico is inferior to commercially available TAVRs? Of course not; it means that this study does not rule out clinically important differences (for instance, a relative risk of 1.8) given its design and event rate, so it is difficult to be certain about true non-inferiority.
My take-away from Portico IDE trial
Similar to ACURATE neo, operators are likely to be more familiar with the comparator devices, which have been available on the market for longer (Sapien XT and Sapien 3, CoreValve, Evolut R and PRO) and whose newer generations share many features with older versions. So, both new valves (ACURATE neo and Portico) are likely to perform better as practice becomes more frequent, reaching a practice-events plateau that makes them more comparable to the other commercially available devices.
The aortic valve gradients and effective orifice area of Portico were a surprise to my eye: an intra-annular prosthesis performed similarly to supra-annular prostheses. Regarding prior concerns about valve leaflet motion, the presenting author explained to the audience that a new CT study is ongoing to better clarify this issue.
Portico was inferior for paravalvular leak, pacemaker insertion, and vascular complications in the randomized comparison vs other commercially available TAVRs. These aspects seemed to improve as experience increased in the FlexNav cohort, suggesting improvements in technology and experience with the device. Still, the FlexNav cohort is more contemporary and not randomized, which makes comparison with the randomized cohort challenging.
Final thoughts on new devices
A weak study design is not the solution for small trials. If investigators have enough funding, resources, and time to run a properly powered trial for patient-important outcomes with a meaningful primary outcome, then that trial alone may bring light to the research community. If not, maybe consider not having a primary outcome at all in your study, and just report the outcomes separately and let meta-analyses (which will include the results of that trial) and the rest of the body of evidence better answer this question.
In the SCOPE I trial you can clearly see that the authors tried to emphasize that no major differences in patient-important outcomes were observed, with statistical power limitations acknowledged. Still, the primary outcome (in the case of SCOPE I, a long heterogeneous list of outcomes) drives how the main conclusion is interpreted by the scientific community, interventionalists and journals.
As a multi-prosthesis operator, I acknowledge the superiority of Sapien 3 for paravalvular leak vs the first generation of ACURATE neo, which needs improvement in patient selection and technology as highlighted above. On top of paravalvular leak, I also acknowledge the superiority of Sapien 3 for pacemaker rate vs Portico, given Portico's self-expanding technology with high radial force in the inflow. But I also acknowledge the superiority of these two prostheses in valvular gradients vs Sapien 3, something that may not be very important the day after the procedure, but many years after… an important factor to be considered in young patients.
A tip for Americans starting their experience with ACURATE neo: this is a great device when 1) selection has been judicious (focus on sizing [between sizes and especially when close to 27 mm of perimeter-derived diameter] and paravalvular leak risk), 2) procedural steps are followed as recommended, with emphasis on a good and effective pre-dilatation, and 3) in special scenarios, such as a small annulus (with excellent post-procedure gradients) or low coronary height (with remarkable coronary ostia preservation), this may be the preferred valve over others. Hopefully new data will bring more light to these emerging technologies for the wellbeing of our patients.